NOVEL PLANT ACYLTRANSFERASES

INTRODUCTION 

This application claims the benefit of U.S. Provisional Application Serial No. 
60/101,939 filed September 25, 1998. 

Technical Field 

The present invention is directed to nucleic acid and amino acid sequences and 
constructs, and methods related thereto. 

Background

Through the development of plant genetic engineering techniques, it is now possible to 
produce transgenic varieties of plant species to provide plants which have novel and desirable 
characteristics. For example, it is now possible to genetically engineer plants for tolerance to
environmental stresses, such as resistance to pathogens and tolerance to herbicides, and to
improve the quality characteristics of the plant, for example improved fatty acid compositions.

However, the number of useful nucleotide sequences for the engineering of such
characteristics is thus far limited, and the rate at which new useful nucleotide sequences
for engineering new characteristics are identified is slow.

The characterization of various acyltransferase proteins is useful for the further study
of plant fatty acid synthesis systems and for the development of novel and/or alternative oil
sources. Studies of plant mechanisms may provide means to further enhance, control,
modify, or otherwise alter the total fatty acyl composition of triglycerides and oils.
Furthermore, the elucidation of the factor(s) critical to the natural production of fatty acids in
plants is desired, including the purification of such factors and the characterization of
element(s) and/or cofactors which enhance the efficiency of the system. Of particular interest
are the nucleic acid sequences of genes encoding proteins which may be useful for
applications in genetic engineering.

SUMMARY OF THE INVENTION

The present invention provides nucleic acid sequences encoding amino acid
sequences for a class of proteins which are related to acyltransferase proteins. Such proteins
are referred to herein as acyltransferase related or acyltransferase-like proteins.

By this invention, nucleic acid sequences encoding these acyltransferase related 
proteins may now be characterized with respect to enzyme activity. In particular, 
identification and isolation of nucleic acid sequences encoding for acyltransferase related 
proteins from Arabidopsis, yeast, corn, and soybean are provided. 

Thus, this invention encompasses acyltransferase related nucleic acid sequences and 
the corresponding amino acid sequences, and the use of these nucleic acid sequences in the 
preparation of oligonucleotides containing such acyltransferase related encoding sequences 
for analysis and recovery of plant acyltransferase related gene sequences. The acyltransferase 
related encoding sequence may encode a complete or partial sequence depending upon the 
intended use. All or a portion of the genomic sequence, or cDNA sequence, is intended. 

Of special interest are recombinant DNA constructs which provide for transcription or 
transcription and translation (expression) of the acyltransferase related sequences in host 
cells. In particular, constructs which are capable of transcription or transcription and 
translation in plant host cells are preferred. For some applications a reduction in expression
of acyltransferase related sequences may be desired. Thus, recombinant constructs
may be designed having the acyltransferase related sequences in a reverse orientation for
expression of an anti-sense sequence, or co-suppression (also known as "transwitch")
constructs may be used. Such constructs may contain a variety of regulatory regions
including transcriptional initiation regions obtained from genes preferentially expressed in 
plant seed tissue. For some uses, it may be desired to use the transcriptional and translational 
initiation regions of the acyltransferase related gene either with the acyltransferase related 
encoding sequence or to direct the transcription and translation of a heterologous sequence. 

Also considered in this invention are the plants and seeds containing the constructs 
and polynucleotides of this invention. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 provides the 204 amino acid conserved sequence profile identified from
comparisons of glycerol-3-phosphate acyltransferase and various lysophosphatidic acid
acyltransferases using PSI-BLAST.

Figure 2 provides an amino acid sequence alignment for the acyltransferase 
sequences. The alignment shown is of the regions of the protein extending from about 30 
amino acids prior to the conserved H in the conserved sequence HXXXXD to 100 amino 
acids after, or downstream, of the P in the conserved PEG sequence motif of the 
acyltransferase-like sequences. 

Figure 3 provides schematics showing the relationship of the identified 
acyltransferases. The relationships described are derived from an alignment of the regions of 
the protein extending from about 30 amino acids prior to the conserved H in the conserved 
sequence HXXXXD to 100 amino acids after, or downstream, of the P in the conserved PEG 
sequence motif of the acyltransferase-like sequences. Figure 3A provides a phylogenetic tree
showing the relationship of several acyltransferases. Figure 3B provides a table showing the 
percent similarities and percent divergence of the novel acyltransferases and known 
acyltransferases using the Clustal method with PAM250 residue weight table. 



DETAILED DESCRIPTION OF THE INVENTION 

In accordance with the subject invention, nucleotide sequences are provided which are
capable of encoding sequences of amino acids, such as a protein, polypeptide or peptide, and which
are related to nucleic acid sequences encoding acyltransferase proteins; these are referred to herein as
acyltransferase-like or acyltransferase related. The novel nucleic acid sequences find use in
the preparation of constructs to direct their expression in a host cell. Furthermore, the novel 
nucleic acid sequences may find use in the preparation of plant expression constructs to 
modify the fatty acid composition of a plant cell. 

In one embodiment of the present invention, nucleic acid sequences related to
acyltransferases, also referred to herein as polynucleotides, are identified from databases.

Isolated proteins, Polypeptides and Polynucleotides

A first aspect of the present invention relates to isolated acyltransferase 
polynucleotides. The polynucleotide sequences of the present invention include isolated 
polynucleotides that encode the polypeptides of the invention having a deduced amino acid 
sequence selected from the group of sequences set forth in the Sequence Listing and to other 
polynucleotide sequences closely related to such sequences and variants thereof. 

The invention provides a polynucleotide sequence identical over its entire length to 
each coding sequence as set forth in the Sequence Listing. The invention also provides the 
coding sequence for the mature polypeptide or a fragment thereof, as well as the coding 
sequence for the mature polypeptide or a fragment thereof in a reading frame with other 
coding sequences, such as those encoding a leader or secretory sequence, a pre-, pro-, or 
prepro- protein sequence. The polynucleotide can also include non-coding sequences, 
including for example, but not limited to, non-coding 5' and 3' sequences, such as the 
transcribed, untranslated sequences, termination signals, ribosome binding sites, sequences 
that stabilize mRNA, introns, polyadenylation signals, and additional coding sequence that 
encodes additional amino acids. For example, a marker sequence can be included to facilitate 
the purification of the fused polypeptide. Polynucleotides of the present invention also 
include polynucleotides comprising a structural gene and the naturally associated sequences 
that control gene expression. 

The invention also includes polynucleotides of the formula: 
X-(R1)n-(R2)-(R3)n-Y

wherein, at the 5' end, X is hydrogen, and at the 3' end, Y is hydrogen or a metal, R1 and R3
are any nucleic acid residue, n is an integer between 1 and 3000, preferably between 1 and
1000, and R2 is a nucleic acid sequence of the invention, particularly a nucleic acid sequence
selected from the group set forth in the Sequence Listing and preferably SEQ ID NOs: 1, 3, 5,
7, 9, 10, 12, 14, 16, 18, 20, 22, and 226-233. In the formula, R2 is oriented so that its 5' end
residue is at the left, bound to R1, and its 3' end residue is at the right, bound to R3. Any
stretch of nucleic acid residues denoted by either R group, where R is greater than 1, may be
either a heteropolymer or a homopolymer, preferably a heteropolymer.

The invention also relates to variants of the polynucleotides described herein that 
encode for variants of the polypeptides of the invention. Variants that are fragments of the 
polynucleotides of the invention can be used to synthesize full-length polynucleotides of the 
invention. Preferred embodiments are polynucleotides encoding polypeptide variants wherein
5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid residues of a polypeptide sequence of the 
invention are substituted, added or deleted, in any combination. Particularly preferred are 
substitutions, additions, and deletions that are silent such that they do not alter the properties 
or activities of the polynucleotide or polypeptide. 

Nucleotide sequences encoding acyltransferases may be obtained from natural sources 
or be partially or wholly artificially synthesized. They may directly correspond to an 
acyltransferase endogenous to a natural source or contain modified amino acid sequences, 
such as sequences which have been mutated, truncated, increased or the like. Acyltransferases 
may be obtained by a variety of methods, including but not limited to, partial or homogenous 
purification of protein extracts, protein modeling, nucleic acid probes, antibody preparations 
and sequence comparisons. Typically an acyltransferase will be derived in whole or in part 
from a natural source. A natural source includes, but is not limited to, prokaryotic and 
eukaryotic sources, including bacteria, yeasts, plants, including algae, and the like.

Of special interest are acyltransferases which are obtainable from eukaryotic sources,
including those which are obtained from plants, and acyltransferases which are
obtainable through the use of these sequences. "Obtainable" refers to those acyltransferases
which have sequences sufficiently similar to the sequences provided herein to provide
a biologically active protein of the present invention.

Further preferred embodiments of the invention are polynucleotides that are at least 50%, 60%, or 70%
identical over their entire length to a polynucleotide encoding a polypeptide of the invention, 
and polynucleotides that are complementary to such polynucleotides. More preferable are 
polynucleotides that comprise a region that is at least 80% identical over its entire length to a 
polynucleotide encoding a polypeptide of the invention and polynucleotides that are 
complementary thereto. In this regard, polynucleotides at least 90% identical over their entire 
length are particularly preferred, those at least 95% identical are especially preferred. Further, 
those with at least 97% identity are highly preferred and those with at least 98% and 99% 
identity are particularly highly preferred, with those at least 99% being the most highly 
preferred. 

Preferred embodiments are polynucleotides that encode polypeptides that retain 
substantially the same biological function or activity as the mature polypeptides encoded by 
the polynucleotides set forth in the Sequence Listing. 



PCT/US99/22231 

WO 00/18889 6 

The invention further relates to polynucleotides that hybridize to the above-described 
sequences. In particular, the invention relates to polynucleotides that hybridize under 
stringent conditions to the above-described polynucleotides. As used herein, the terms 
"stringent conditions" and "stringent hybridization conditions" mean that hybridization will 
generally occur if there is at least 95% and preferably at least 97% identity between the 
sequences. An example of stringent hybridization conditions is overnight incubation at 42°C 
in a solution comprising 50% formamide, 5x SSC (150 mM NaCl, 15 mM trisodium citrate), 
50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 
micrograms/milliliter denatured, sheared salmon sperm DNA, followed by washing the 
hybridization support in 0.1x SSC at approximately 65°C. Other hybridization and wash
conditions are well known and are exemplified in Sambrook, et al., Molecular Cloning: A
Laboratory Manual, Second Edition, Cold Spring Harbor, NY (1989), particularly Chapter 11.
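
As a worked illustration of the buffer arithmetic in the example conditions above, the following minimal sketch scales SSC component concentrations by fold strength. It assumes that the parenthetical composition (150 mM NaCl, 15 mM trisodium citrate) is the standard 1x SSC definition; the function name ssc_concentrations is hypothetical and is not part of the original disclosure.

```python
def ssc_concentrations(fold, nacl_1x_mM=150.0, citrate_1x_mM=15.0):
    """Return (NaCl mM, trisodium citrate mM) for SSC at the given fold strength.

    Assumption: 1x SSC is 150 mM NaCl and 15 mM trisodium citrate.
    """
    if fold <= 0:
        raise ValueError("fold strength must be positive")
    return (fold * nacl_1x_mM, fold * citrate_1x_mM)


# Hybridization buffer (5x SSC) and stringent wash (0.1x SSC) from the example.
hybridization = ssc_concentrations(5)
wash = ssc_concentrations(0.1)
print("5x SSC:", hybridization)
print("0.1x SSC:", wash)
```

Under that assumption, the wash step is fifty-fold lower in salt than the hybridization step, which is what makes the wash the more stringent of the two.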

The invention also provides a polynucleotide consisting essentially of a 
polynucleotide sequence obtainable by screening an appropriate library containing the 
complete gene for a polynucleotide sequence set forth in the Sequence Listing under stringent
hybridization conditions with a probe having the sequence of said polynucleotide sequence or 
a fragment thereof; and isolating said polynucleotide sequence. Fragments useful for 
obtaining such a polynucleotide include, for example, probes and primers as described herein. 

As discussed herein regarding polynucleotide assays of the invention, for example, 
polynucleotides of the invention can be used as a hybridization probe for RNA, cDNA, or 
genomic DNA to isolate full length cDNAs or genomic clones encoding a polypeptide and to 
isolate cDNA or genomic clones of other genes that have a high sequence similarity to a 
polynucleotide set forth in the Sequence Listing. Such probes will generally comprise at least 
15 bases. Preferably such probes will have at least 30 bases and can have at least 50 bases. 
Particularly preferred probes will have between 30 bases and 50 bases, inclusive. 

The coding region of each gene that comprises or is comprised by a polynucleotide 
sequence set forth in the Sequence Listing may be isolated by screening using a DNA 
sequence provided in the Sequence Listing to synthesize an oligonucleotide probe. A labeled 
oligonucleotide having a sequence complementary to that of a gene of the invention is then 
used to screen a library of cDNA, genomic DNA or mRNA to identify members of the library 
which hybridize to the probe. For example, synthetic oligonucleotides are prepared which 
correspond to the N-terminal sequence of the polypeptide. The partial sequences so prepared 
can then be used as probes to obtain acyltransferase clones from a gene library prepared from 
a cell source of interest. Alternatively, where oligonucleotides of low degeneracy can be
prepared from particular peptides, such probes may be used directly to screen gene libraries
for gene sequences. In particular, screening of cDNA libraries in phage vectors is useful in
such methods due to lower levels of background hybridization.

Typically, a sequence obtainable from the use of nucleic acid probes will show 60- 
70% sequence identity between the target acyltransferase sequence and the encoding sequence 
used as a probe. However, lengthy sequences with as little as 50-60% sequence identity may 
also be obtained. The nucleic acid probes may be a lengthy fragment of the nucleic acid 
sequence, or may also be a shorter, oligonucleotide probe. When longer nucleic acid 
fragments are employed as probes (greater than about 100 bp), one may screen at lower 
stringencies in order to obtain sequences from the target sample which have 20-50% 
deviation (i.e., 50-80% sequence homology) from the sequences used as probe.
Oligonucleotide probes can be considerably shorter than the entire nucleic acid sequence 
encoding an acyltransferase enzyme, but should be at least about 10, preferably at least about 
15 and more preferably at least about 20 nucleotides. A higher degree of sequence identity is
desired when shorter regions are used as opposed to longer regions. It may thus be desirable 
to identify regions of highly conserved amino acid sequence to design oligonucleotide probes 
for detecting and recovering other related genes. Shorter probes are often particularly useful 
for polymerase chain reactions (PCR), especially when highly conserved sequences can be 
identified. (See, Gould, et al., PNAS USA (1989) 86:1934-1938).

The skilled artisan will appreciate that, in many cases, an isolated cDNA sequence 
will be incomplete, in that the region coding for the polypeptide is truncated with respect to 
the 5' terminus of the cDNA. This is a consequence of the reverse transcriptase, an enzyme 
with low 'processivity' (a measure of the ability of the enzyme to remain attached to the 
template during the polymerization reaction) employed during the first strand cDNA 
synthesis. 

Several methods are available, and are well known to the skilled artisan, to obtain
full-length cDNAs, or extend short cDNAs, for example those based on the method of Rapid
Amplification of cDNA Ends (RACE) (see, for example, Frohman et al. (1988) Proc. Natl.
Acad. Sci. USA 85:8998-9002). Recent modifications of the technique, exemplified by the
Marathon™ technology (Clontech Laboratories, Inc.) for example, have significantly
simplified obtaining full-length cDNA sequences.

Another aspect of the present invention relates to isolated acyltransferase
polypeptides. Such polypeptides include isolated polypeptides set forth in the Sequence
Listing, as well as polypeptides and fragments thereof, particularly those polypeptides which
exhibit acyltransferase activity and also those polypeptides which have at least 50%, 60% or
70% identity, preferably at least 80% identity, more preferably at least 90% identity, and most
preferably at least 95% identity to a polypeptide sequence selected from the group of
sequences set forth in the Sequence Listing, and also include portions of such polypeptides,
wherein such portion of the polypeptide preferably includes at least 30 amino acids and more
preferably includes at least 50 amino acids.

"Identity", as is well understood in the art, is a relationship between two or more 
polypeptide sequences or two or more polynucleotide sequences, as determined by comparing 
the sequences. In the art, "identity" also means the degree of sequence relatedness between 
polypeptide or polynucleotide sequences, as determined by the match between strings of such 
sequences. "Identity" can be readily calculated by known methods including, but not limited 
to those described in Computational Molecular Biology, Lesk, A.M., ed., Oxford University
Press, New York (1988); Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York (1993); Computer Analysis of Sequence Data, Part 1, Griffin,
A.M. and Griffin, H.G., eds., Humana Press, New Jersey (1994); Sequence Analysis in
Molecular Biology, von Heinje, G., Academic Press (1987); Sequence Analysis Primer,
Gribskov, M. and Devereux, J., eds., Stockton Press, New York (1991); and Carillo, H., and
Lipman, D., SIAM J Applied Math, 48:1073 (1988). Methods to determine identity are
designed to give the largest match between the sequences tested. Moreover, methods to
determine identity are codified in publicly available programs. Computer programs which
can be used to determine identity between two sequences include, but are not limited to, GCG
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984)); and the suite of five BLAST
programs, three designed for nucleotide sequence queries (BLASTN, BLASTX, and
TBLASTX) and two designed for protein sequence queries (BLASTP and TBLASTN)
(Coulson, Trends in Biotechnology, 12: 76-80 (1994); Birren, et al., Genome Analysis, 1:
543-559 (1997)). The BLAST X program is publicly available from NCBI and other sources
(BLAST Manual, Altschul, S., et al., NCBI NLM NIH, Bethesda, MD 20894; Altschul, S., et
al., J. Mol. Biol., 215:403-410 (1990)). The well known Smith Waterman algorithm can also
be used to determine identity.
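
By way of illustration only, percent identity over a pair of sequences that have already been aligned can be sketched as follows. The function name percent_identity, the use of "-" as the gap character, and the choice of denominator (all aligned columns except those that are gaps in both sequences) are assumptions made for this sketch; the programs cited above may count columns differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity = matching residue columns / aligned columns,
    where columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # column carries no information
        columns += 1
        if x == y:
            matches += 1  # x is a residue here, since both-gap was skipped
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


# Hypothetical aligned fragments, for demonstration only.
print(percent_identity("ACGT", "ACGA"))
print(percent_identity("AC-T", "ACGT"))
```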

Parameters for polypeptide sequence comparison typically include the following:

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)

Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA 89:10915-10919 (1992)

Gap Penalty: 12

Gap Length Penalty: 4

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters, along
with no penalty for end gaps, are the default parameters for peptide comparisons.

Parameters for polynucleotide sequence comparison include the following: 

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970) 

Comparison matrix: matches = +10; mismatches = 0 

Gap Penalty: 50 

Gap Length Penalty: 3 

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison, Wisconsin. The above parameters are the
default parameters for nucleic acid comparisons.
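
The polynucleotide parameters listed above can be illustrated with a minimal global alignment scoring sketch. This is not the GCG "gap" program. It is a generic Needleman-Wunsch style dynamic program with affine gaps, written under two stated assumptions: a gap of length k costs gap_penalty + k * gap_length_penalty, and gaps at the sequence ends are penalized like any other gap. The function name global_alignment_score is hypothetical.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, match=10, mismatch=0,
                           gap_penalty=50, gap_length_penalty=3):
    """Best global alignment score of a and b with affine gap costs.

    Assumption: a gap of length k costs gap_penalty + k * gap_length_penalty,
    and end gaps are penalized.
    """
    n, m = len(a), len(b)
    # M: last column pairs two residues; X: gap in b; Y: gap in a.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_penalty + i * gap_length_penalty)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_penalty + j * gap_length_penalty)
    open_cost = gap_penalty + gap_length_penalty  # first position of a new gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] - open_cost,
                          Y[i - 1][j] - open_cost,
                          X[i - 1][j] - gap_length_penalty)
            Y[i][j] = max(M[i][j - 1] - open_cost,
                          X[i][j - 1] - open_cost,
                          Y[i][j - 1] - gap_length_penalty)
    return max(M[n][m], X[n][m], Y[n][m])


# Hypothetical fragments, for demonstration only.
print(global_alignment_score("ACGT", "ACGT"))
print(global_alignment_score("ACGT", "AGT"))
```

With a gap penalty that is five times the match score, a single gap is only worthwhile when it recovers more than five additional matches, which is why these defaults favor ungapped alignments for short sequences.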

The invention also includes polypeptides of the formula: 

X-(R1)n-(R2)-(R3)n-Y

wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is hydrogen or
a metal, R1 and R3 are any amino acid residue, n is an integer between 1 and 1000, and R2 is
an amino acid sequence of the invention, particularly an amino acid sequence selected from
the group set forth in the Sequence Listing and preferably SEQ ID NOs: 2, 4, 6, 8, 11, 13, 15,
17, 19, 21, 23, and 218-225. In the formula, R2 is oriented so that its amino terminal residue
is at the left, bound to R1, and its carboxy terminal residue is at the right, bound to R3. Any
stretch of amino acid residues denoted by either R group, where R is greater than 1, may be
either a heteropolymer or a homopolymer, preferably a heteropolymer.

Polypeptides of the present invention include isolated polypeptides encoded by a
polynucleotide comprising a sequence selected from the group of a sequence contained in
SEQ ID NOs: 1, 3, 5, 7, 9, 10, 12, 14, 16, 18, 20, 22, and 226-233.

The polypeptides of the present invention can be mature protein or can be part of a
fusion protein.

Fragments and variants of the polypeptides are also considered to be a part of the
invention. A fragment is a variant polypeptide which has an amino acid sequence that is 
entirely the same as part but not all of the amino acid sequence of the previously described 
polypeptides. The fragments can be "free-standing" or comprised within a larger polypeptide 
of which the fragment forms a part or a region, most preferably as a single continuous region. 
Preferred fragments are biologically active fragments which are those fragments that mediate 
activities of the polypeptides of the invention, including those with similar activity or 
improved activity or with a decreased activity. Also included are those fragments that are
antigenic or immunogenic in an animal, particularly a human.

Variants of the polypeptide also include polypeptides that vary from the sequences set
forth in the Sequence Listing by conservative amino acid substitutions, that is, substitution of a
residue by another with like characteristics. In general, such substitutions are among Ala,
Val, Leu and Ile; between Ser and Thr; between Asp and Glu; between Asn and Gln; between
Lys and Arg; or between Phe and Tyr. Particularly preferred are variants in which 5 to 10; 1
to 5; 1 to 3 or one amino acid(s) are substituted, deleted, or added, in any combination.
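
The conservative substitution groups named above can be captured in a short lookup, sketched here for illustration. The three-letter residue codes (including Ile and Gln) follow the groups in the preceding paragraph; the names CONSERVATIVE_GROUPS and is_conservative are hypothetical, and treating an unchanged residue as "not a substitution" is an assumption of this sketch.

```python
# Groups of residues with like characteristics, as listed in the text above.
CONSERVATIVE_GROUPS = [
    {"Ala", "Val", "Leu", "Ile"},
    {"Ser", "Thr"},
    {"Asp", "Glu"},
    {"Asn", "Gln"},
    {"Lys", "Arg"},
    {"Phe", "Tyr"},
]


def is_conservative(original, replacement):
    """Return True if replacing `original` with `replacement` stays within one group."""
    if original == replacement:
        return False  # assumption: an unchanged residue is not a substitution
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)


print(is_conservative("Leu", "Ile"))
print(is_conservative("Ser", "Asp"))
```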

Variants that are fragments of the polypeptides of the invention can be used to 
produce the corresponding full length polypeptide by peptide synthesis. Therefore, these 
variants can be used as intermediates for producing the full-length polypeptides of the 
invention. 

The polynucleotides and polypeptides of the invention can be used, for example, in 
the transformation of various host cells, as further discussed herein. 

The invention also provides polynucleotides that encode a polypeptide that is a mature 
protein plus additional amino or carboxyl-terminal amino acids, or amino acids within the 
mature polypeptide (for example, when the mature form of the protein has more than one 
polypeptide chain). Such sequences can, for example, play a role in the processing of a 
protein from a precursor to a mature form, allow protein transport, shorten or lengthen protein 
half-life, or facilitate manipulation of the protein in assays or production. It is contemplated 
that cellular enzymes can be used to remove any additional amino acids from the mature 
protein. 

A precursor protein, having the mature form of the polypeptide fused to one or more 
prosequences may be an inactive form of the polypeptide. The inactive precursors generally 
are activated when the prosequences are removed. Some or all of the prosequences may be 
removed prior to activation. Such precursor proteins are generally called proproteins.

The polynucleotide and polypeptide sequences can also be used to identify additional
sequences which are homologous to the sequences of the present invention. The most
preferable and convenient method is to store the sequence in a computer readable medium,
for example, floppy disk, CD ROM, hard disk drives, external disk drives and DVD, and then
to use the stored sequence to search a sequence database with well known searching tools.
Examples of public databases include the DNA Database of Japan
(DDBJ) (http://www.ddbj.nig.ac.jp/); GenBank
(http://www.ncbi.nlm.nih.gov/web/Genbank/Index.html); and the European Molecular
Biology Laboratory Nucleic Acid Sequence Database (EMBL)
(http://www.ebi.ac.uk/ebi_docs/embl_db.html). A number of different search algorithms are
available to the skilled artisan, one example of which is the suite of programs referred to as
BLAST programs. There are five implementations of BLAST, three designed for nucleotide
sequence queries (BLASTN, BLASTX, and TBLASTX) and two designed for protein
sequence queries (BLASTP and TBLASTN) (Coulson, Trends in Biotechnology, 12: 76-80
(1994); Birren, et al., Genome Analysis, 1: 543-559 (1997)). Additional programs are
available in the art for the analysis of identified sequences, such as sequence alignment
programs, programs for the identification of more distantly related sequences, and the like,
and are well known to the skilled artisan.
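
As an illustrative sketch only, the choice among the five BLAST implementations named above can be expressed as a lookup on the query type and the database type. The function name choose_blast_program and the translated_comparison flag are hypothetical. The mapping reflects the conventional use of each program: BLASTX translates a nucleotide query for comparison against a protein database, TBLASTN compares a protein query against a translated nucleotide database, and TBLASTX translates both.

```python
def choose_blast_program(query_type, database_type, translated_comparison=False):
    """Pick a BLAST program name from query and database sequence types.

    query_type and database_type are "nucleotide" or "protein".
    translated_comparison only matters for nucleotide vs nucleotide searches,
    where it selects TBLASTX (both sides translated) instead of BLASTN.
    """
    key = (query_type, database_type)
    if key == ("nucleotide", "nucleotide"):
        return "TBLASTX" if translated_comparison else "BLASTN"
    if key == ("nucleotide", "protein"):
        return "BLASTX"
    if key == ("protein", "protein"):
        return "BLASTP"
    if key == ("protein", "nucleotide"):
        return "TBLASTN"
    raise ValueError("types must be 'nucleotide' or 'protein'")


# A protein query searched against a nucleotide (for example, EST) database.
print(choose_blast_program("protein", "nucleotide"))
```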

Plant Constructs and Methods of Use 

Of interest in the present invention is the use of the nucleotide sequences, or
polynucleotides, in recombinant DNA constructs to direct the transcription or transcription 
and translation (expression) of the acyltransferase sequences of the present invention in a host 
cell. 

Of particular interest is the use of the nucleotide sequences, or polynucleotides, in 
recombinant DNA constructs to direct the transcription or transcription and translation 
(expression) of the acyltransferase sequences of the present invention in a host cell. The 
expression constructs generally comprise a promoter functional in a host cell operably linked
to a nucleic acid sequence encoding an acyltransferase of the present invention and a
transcriptional termination region functional in a host cell.

By "host cell" is meant a cell which contains a vector and supports the replication, 
and/or transcription or transcription and translation (expression) of the expression construct. 

Host cells for use in the present invention can be prokaryotic cells, such as E. coli, or
eukaryotic cells such as yeast, plant, insect, amphibian, or mammalian cells. Preferably, host
cells are monocotyledonous or dicotyledonous plant cells.

Of particular interest in the present invention is the use of the polynucleotides of the 
present invention for the preparation of constructs to direct the transcription or transcription 
and translation of the nucleotide sequences encoding an acyltransferase in a host plant cell. 
Plant expression constructs generally comprise a promoter functional in a plant host cell 
operably linked to a nucleic acid sequence of the present invention and a transcriptional termination
region functional in a host plant cell. 

Those skilled in the art will recognize that there are a number of promoters which are 
functional in plant cells, and have been described in the literature. Chloroplast and plastid 
specific promoters, chloroplast or plastid functional promoters, and chloroplast or plastid 
operable promoters are also envisioned. 

One set of promoters is the constitutive promoters, such as the CaMV35S or FMV35S
promoters, that yield high levels of expression in most plant organs. Enhanced or duplicated
versions of the CaMV35S and FMV35S promoters are useful in the practice of this invention
(Odell, et al. (1985) Nature 313:810-812; Rogers, U.S. Patent Number 5,378,619). In
addition, it may also be preferred to bring about expression of the protein of interest in 
specific tissues of the plant, such as leaf, stem, root, tuber, seed, fruit, etc., and the promoter 
chosen should have the desired tissue and developmental specificity. 

Of particular interest is the expression of the nucleic acid sequences of the present 
invention from transcription initiation regions which are preferentially expressed in a plant 
seed tissue. Examples of such seed preferential transcription initiation sequences include 
those sequences derived from sequences encoding plant storage protein genes or from genes 
involved in fatty acid biosynthesis in oilseeds. Examples of such promoters include the 5' 
regulatory regions from such genes as napin (Kridl et al., Seed Sci. Res. 1:209-219 (1991)),
phaseolin, zein, soybean trypsin inhibitor, ACP, stearoyl-ACP desaturase, soybean α' subunit
of β-conglycinin (soy 7s, (Chen et al., Proc. Natl. Acad. Sci., 83:8560-8564 (1986))) and
oleosin. 

It may be advantageous to direct the localization of proteins conferring acyltransferase activity
to a particular subcellular compartment, for example, to the mitochondrion, endoplasmic 
reticulum, vacuoles, chloroplast or other plastidic compartment. For example, where the 
genes of interest of the present invention will be targeted to plastids, such as chloroplasts, for 
expression, the constructs will also employ the use of sequences to direct the gene to the
plastid. Such sequences are referred to herein as chloroplast transit peptides (CTP) or plastid 
transit peptides (PTP). In this manner, where the gene of interest is not directly inserted into 
the plastid, the expression construct will additionally contain a gene encoding a transit 
peptide to direct the gene of interest to the plastid. The chloroplast transit peptides may be 
derived from the gene of interest, or may be derived from a heterologous sequence having a 
CTP. Such transit peptides are known in the art. See, for example, Von Heijne et al. (1991)
Plant Mol. Biol. Rep. 9:104-126; Clark et al. (1989) J. Biol. Chem. 264:17544-17550;
della-Cioppa et al. (1987) Plant Physiol. 84:965-968; Romer et al. (1993) Biochem. Biophys. Res.
Commun. 196:1414-1421; and Shah et al. (1986) Science 233:478-481. Additional transit
peptides for the translocation of the protein to the endoplasmic reticulum (ER), or vacuole 
may also find use in the constructs of the present invention. 

Depending upon the intended use, the constructs may contain the nucleic acid 
sequence which encodes the entire acyltransferase protein, or a portion thereof. For example, 
where antisense inhibition of a given acyltransferase protein is desired, the entire sequence is 
not required. Furthermore, where acyltransferase sequences used in constructs are intended 
for use as probes, it may be advantageous to prepare constructs containing only a particular 
portion of an acyltransferase encoding sequence, for example a sequence which is discovered 
to encode a highly conserved acyltransferase region. 

The skilled artisan will recognize that there are various methods for the inhibition of 
expression of endogenous sequences in a host cell. Such methods include, but are not limited 
to antisense suppression (Smith, et al. (1988) Nature 334:724-726), co-suppression (Napoli, 
et al. (1989) Plant Cell 2:279-289), ribozymes (PCT Publication WO 97/10328), and 
combinations of sense and antisense, such as those described by Waterhouse, et al. (1998) 
Proc. Natl. Acad. Sci. USA 95:13959-13964. Methods for the suppression of endogenous 
sequences in a host cell typically employ the transcription or transcription and translation of 
at least a portion of the sequence to be suppressed. Such sequences may be homologous to 
coding as well as non-coding regions of the endogenous sequence. 

Regulatory transcript termination regions may be provided in plant expression 
constructs of this invention as well. Transcript termination regions may be provided by the 
DNA sequence encoding the acyltransferase or a convenient transcription termination region 
derived from a different gene source, for example, the transcript termination region which is 
naturally associated with the transcript initiation region. The skilled artisan will recognize 
that any convenient transcript termination region which is capable of terminating transcription 
in a plant cell may be employed in the constructs of the present invention. 

Alternatively, constructs may be prepared to direct the expression of the 
acyltransferase sequences directly from the host plant cell plastid. Such constructs and 
methods are known in the art and are generally described, for example, in Svab, et al. (1990) 
Proc. Natl. Acad. Sci. USA 87:8526-8530 and Svab and Maliga (1993) Proc. Natl. Acad. Sci. 
USA 90:913-917 and in U.S. Patent Number 5,693,507. 

A plant cell, tissue, organ, or plant into which the recombinant DNA constructs 
containing the expression constructs have been introduced is considered transformed, 
transfected, or transgenic. A transgenic or transformed cell or plant also includes progeny of 
the cell or plant and progeny produced from a breeding program employing such a transgenic 
plant as a parent in a cross and exhibiting an altered genotype resulting from the presence of 
an introduced acyltransferase nucleic acid sequence. 

The term "introduced" in the context of inserting a nucleic acid sequence into a cell, 
means "transfection", or "transformation" or "transduction" and includes reference to the 
incorporation of a nucleic acid sequence into a eukaryotic or prokaryotic cell where the 
nucleic acid sequence may be incorporated into the genome of the cell (for example, 
chromosome, plasmid, plastid, or mitochondrial DNA), converted into an autonomous 
replicon, or transiently expressed (for example, transfected mRNA). 

Plant expression or transcription constructs having an acyltransferase as the DNA 
sequence of interest for increased or decreased expression thereof may be employed with a 
wide variety of plant life, particularly, plant life involved in the production of vegetable oils 
for edible and industrial uses. Plants of interest in the present invention include 
monocotyledonous and dicotyledonous plants. Most especially preferred are temperate 
oilseed crops. Plants of interest include, but are not limited to, rapeseed (Canola and High 
Erucic Acid varieties), sunflower, safflower, cotton, soybean, peanut, coconut and oil palms, 
and corn. Depending on the method for introducing the recombinant constructs into the host 
cell, other DNA sequences may be required. Importantly, this invention is applicable to 
dicotyledonous and monocotyledonous species alike and will be readily applicable to new and/or 
improved transformation and regulation techniques. 

As used herein, the term "plant" includes reference to whole plants, plant organs (for 
example, leaves, stems, roots, etc.), seeds, and plant cells and progeny of same. Plant cell, as 
used herein includes, without limitation, seeds, suspension cultures, embryos, meristematic 
regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen, and 
microspores. The class of plants which can be used in the methods of the present invention is 
generally as broad as the class of higher plants amenable to transformation techniques, 
including both monocotyledonous and dicotyledonous plants. Particularly preferred plants of 
interest include, but are not limited to, rapeseed (Canola and High Erucic Acid varieties), 
sunflower, safflower, cotton, soybean, peanut, coconut and oil palms, and corn. Most 
especially preferred plants include Brassica, soybean, and corn. 

As used herein, "transgenic plant" includes reference to a plant which comprises 
within its genome a heterologous polynucleotide. Generally, the heterologous polynucleotide 
is stably integrated within the genome such that the polynucleotide is passed on to successive 
generations. The heterologous polynucleotide may be integrated into the genome alone or as 
part of a recombinant expression cassette. "Transgenic" is used herein to include any cell, 
cell line, callus, tissue, plant part or plant, the genotype of which has been altered by the 
presence of heterologous nucleic acid including those transgenics initially so altered as well 
as those created by sexual crosses or asexual propagation from the initial transgenic. 

Thus a plant having within its cells a heterologous polynucleotide is referred to herein 
as a transgenic plant. The heterologous polynucleotide can be either stably integrated into the 
genome, or can be extra-chromosomal. Preferably, the polynucleotide of the present 
invention is stably integrated into the genome such that the polynucleotide is passed on to 
successive generations. The polynucleotide is integrated into the genome alone or as part of a 
recombinant expression cassette. "Transgenic" is used herein to include any cell, cell line, 
callus, tissue, plant part or plant, the genotype of which has been altered by the presence of 
heterologous nucleic acids including those transgenics initially so altered as well as those 
created by sexual crosses or asexual reproduction of the initial transgenics. 

As used herein, "heterologous" in reference to a nucleic acid is a nucleic acid that 
originates from a foreign species, or, if from the same species, is substantially modified from 
its native form in composition and/or genomic locus by deliberate human intervention. For 
example, a promoter operably linked to a heterologous structural gene is from a species 
different from that from which the structural gene was derived, or, if from the same species, 
one or both are substantially modified from their original form. A heterologous protein may 
originate from a foreign species, or, if from the same species, is substantially modified from 
its original form by deliberate human intervention. 

As used herein, a "recombinant expression cassette" is a nucleic acid construct, 
generated recombinantly or synthetically, with a series of specified nucleic acid elements 
which permit transcription of a particular nucleic acid in a target cell. The recombinant 
expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA, 
plastid DNA, virus, or nucleic acid fragment. Typically, the recombinant expression cassette 
portion of an expression vector includes, among other sequences, a nucleic acid sequence to 
be transcribed and a promoter. 

It is contemplated that the gene sequences may be synthesized, either completely or in 
part, especially where it is desirable to provide plant-preferred sequences. Thus, all or a 
portion of the desired structural gene (that portion of the gene which encodes the 
acyltransferase protein) may be synthesized using codons preferred by a selected host. 
Host-preferred codons may be determined, for example, from the codons used most frequently in 
the proteins expressed in a desired host species. 
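
By way of illustration only, the determination of host-preferred codons described above may be sketched in Python. The sketch assumes that a set of in-frame coding sequences from the desired host is available as DNA strings. The function names `preferred_codons` and `back_translate` and the toy input are hypothetical and form no part of the present disclosure.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def preferred_codons(coding_sequences):
    """Return {amino acid: most frequently used codon} for in-frame host CDSs."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skip codons containing ambiguous bases
                counts[codon] += 1
    best = {}
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        if aa not in best or n > counts[best[aa]]:
            best[aa] = codon
    return best


def back_translate(protein, codon_choice):
    """Encode a protein using one chosen codon per amino acid."""
    return "".join(codon_choice[aa] for aa in protein.upper())


# Hypothetical toy host data, for illustration only.
host_cds = ["ATGGCCGCCGCTTAA"]
table = preferred_codons(host_cds)
synthetic_gene = back_translate("MA", table)
```

In this sketch an amino acid that never occurs in the host data has no entry in the table, so `back_translate` raises a `KeyError` for it; a practical codon table would be derived from many host genes.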

One skilled in the art will readily recognize that antibody preparations, nucleic acid 
probes (DNA and RNA) and the like may be prepared and used to screen and recover 
"homologous" or "related" acyltransferase from a variety of plant sources. Homologous 
sequences are found when there is an identity of sequence, which may be determined upon 
comparison of sequence information, nucleic acid or amino acid, or through hybridization 
reactions between a known acyltransferase and a candidate source. Conservative changes, 
such as Glu/Asp, Val/Ile, Ser/Thr, Arg/Lys and Gln/Asn may also be considered in 
determining sequence homology. Amino acid sequences are considered homologous by as 
little as 25% sequence identity between the two complete mature proteins. (See generally, 
Doolittle, R.F., OF URFS and ORFS (University Science Books, CA, 1986).) 
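
For purposes of illustration, the comparison described above may be sketched as follows for two amino acid sequences that have already been aligned (gaps written as "-"). The choice to divide by the number of alignment columns is an assumption of this sketch, as is the function name `identity_and_similarity`; other conventions for percent identity exist.

```python
# Conservative changes named in the text: Glu/Asp, Val/Ile, Ser/Thr, Arg/Lys, Gln/Asn.
CONSERVATIVE_PAIRS = {
    frozenset("ED"), frozenset("VI"), frozenset("ST"),
    frozenset("RK"), frozenset("QN"),
}


def identity_and_similarity(aligned_a, aligned_b):
    """Return (% identity, % identity plus conservative changes).

    Both inputs must be the same length; '-' marks a gap. Percentages are
    taken over all alignment columns (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gapped columns count toward length but not toward matches
        if x == y:
            identical += 1
            similar += 1
        elif frozenset((x, y)) in CONSERVATIVE_PAIRS:
            similar += 1
    columns = len(aligned_a)
    return 100.0 * identical / columns, 100.0 * similar / columns


# Hypothetical toy alignment, for illustration only.
identity, similarity = identity_and_similarity("MKVLSE", "MRILTD")
is_candidate_homolog = identity >= 25.0
```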

Thus, other acyltransferase sequences can be obtained from the specific exemplified 
sequences provided herein. Furthermore, it will be apparent that one can obtain natural and 
synthetic sequences, including modified amino acid sequences and starting materials for 
synthetic-protein modeling from the exemplified sequences and from acyltransferases which 
are obtained through the use of such exemplified sequences. Modified amino acid sequences 
include sequences which have been mutated, truncated, increased and the like, whether such 
sequences were partially or wholly synthesized. Sequences which are actually purified from 
plant preparations or are identical or encode identical proteins thereto, regardless of the 
method used to obtain the protein or sequence, are equally considered naturally derived. 

For immunological screening, antibodies to the acyltransferase protein can be 
prepared by injecting rabbits or mice with the purified protein or portion thereof, such 
methods of preparing antibodies being well known to those in the art. Either monoclonal or 
polyclonal antibodies can be produced, although typically polyclonal antibodies are more 
useful for gene isolation. Western analysis may be conducted to determine that a related 
protein is present in a crude extract of the desired plant species, as determined by 
cross-reaction with the antibodies to the acyltransferase protein. When cross-reactivity is observed, 
genes encoding the related proteins are isolated by screening expression libraries representing 
the desired plant species. Expression libraries can be constructed in a variety of commercially 
available vectors, including lambda gt11, as described in Sambrook, et al. (Molecular 
Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring Harbor Laboratory, Cold 
Spring Harbor, New York). 

The nucleic acid sequences associated with acyltransferase proteins will find many 
uses. For example, recombinant constructs can be prepared which can be used as probes, or 
which will provide for expression of the acyltransferase protein in host cells to produce a 
ready source of the enzyme and/or to modify the composition of triglycerides found therein. 
Other useful applications may be found when the host cell is a plant host cell, either in vitro 
or in vivo. 

The modification of fatty acid compositions may also affect the fluidity of plant 
membranes. Different lipid concentrations have been observed in cold-hardened plants, for 
example. By this invention, one may be capable of introducing traits which will lend to chill 
tolerance. Constitutive or temperature inducible transcription initiation regulatory control 
regions may have special applications for such uses. 

As discussed above, nucleic acid sequence encoding an acyltransferase of this 
invention may include genomic, cDNA or mRNA sequence. By "encoding" is meant that the 
sequence corresponds to a particular amino acid sequence either in a sense or anti-sense 
orientation. By "extrachromosomal" is meant that the sequence is outside of the plant 
genome with which it is naturally associated. By "recombinant" is meant that the sequence 
contains a genetically engineered modification through manipulation via mutagenesis, 
restriction enzymes, and the like. 

Once the desired acyltransferase nucleic acid sequence is obtained, it may be 
manipulated in a variety of ways. Where the sequence involves non-coding flanking regions, 
the flanking regions may be subjected to resection, mutagenesis, etc. Thus, transitions, 
transversions, deletions, and insertions may be performed on the naturally occurring 
sequence. In addition, all or part of the sequence may be synthesized. In the structural gene, 
one or more codons may be modified to provide for a modified amino acid sequence, or one 
or more codon mutations may be introduced to provide for a convenient restriction site or 
other purpose involved with construction or expression. The structural gene may be further 
modified by employing synthetic adapters, linkers to introduce one or more convenient 
restriction sites, or the like. 

The nucleic acid or amino acid sequences encoding an acyltransferase of this 
invention may be combined with other non-native, or "heterologous", sequences in a variety 
of ways. By "heterologous" sequences is meant any sequence which is not naturally found 
joined to the acyltransferase, including, for example, combinations of nucleic acid sequences 
from the same plant which are not naturally found joined together. 

The DNA sequence encoding an acyltransferase of this invention may be employed in 
conjunction with all or part of the gene sequences normally associated with the 
acyltransferase. In its component parts, a DNA sequence encoding acyltransferase is 
combined in a DNA construct having, in the 5' to 3' direction of transcription, a transcription 
initiation control region capable of promoting transcription and translation in a host cell, the 
DNA sequence encoding plant acyltransferase and a transcription and translation termination 
region. 

Potential host cells include both prokaryotic cells, such as E. coli, and eukaryotic cells 
such as yeast, insect, amphibian, or mammalian cells. A host cell may be unicellular or found 
in a multicellular differentiated or undifferentiated organism depending upon the intended 
use. Preferably, host cells of the present invention include plant cells, both 
monocotyledonous and dicotyledonous. Cells of this invention may be distinguished by 
having a sequence foreign to the wild-type cell present therein, for example, by having a 
recombinant nucleic acid construct encoding an acyltransferase therein. 

The methods used for the transformation of the host plant cell are not critical to the 
present invention. The transformation of the plant is preferably permanent, i.e. by integration 
of the introduced expression constructs into the host plant genome, so that the introduced 
constructs are passed onto successive plant generations. The skilled artisan will recognize 
that a wide variety of transformation techniques exist in the art, and new techniques are 
continually becoming available. Any technique that is suitable for the target host plant can be 
employed within the scope of the present invention. For example, the constructs can be 
introduced in a variety of forms including, but not limited to, as a strand of DNA, in a 
plasmid, or in an artificial chromosome. The introduction of the constructs into the target 
plant cells can be accomplished by a variety of techniques, including, but not limited to, 
calcium-phosphate-DNA co-precipitation, electroporation, microinjection, Agrobacterium 
infection, liposomes or microprojectile transformation. The skilled artisan can refer to the 
literature for details and select suitable techniques for use in the methods of the present 
invention. 

Normally, included with the DNA construct will be a structural gene having the 
necessary regulatory regions for expression in a host and providing for selection of 
transformant cells. The gene may provide for resistance to a cytotoxic agent, e.g. antibiotic, 
heavy metal, toxin, etc., complementation providing prototrophy to an auxotrophic host, viral 
immunity or the like. Depending upon the number of different host species into which the expression 
construct or components thereof are introduced, one or more markers may be employed, 
where different conditions for selection are used for the different hosts. 

Where Agrobacterium is used for plant cell transformation, a vector may be used 
which may be introduced into the Agrobacterium host for homologous recombination with 
T-DNA or the Ti- or Ri-plasmid present in the Agrobacterium host. The Ti- or Ri-plasmid 
containing the T-DNA for recombination may be armed (capable of causing gall formation) 
or disarmed (incapable of causing gall formation), the latter being permissible, so long as the 
vir genes are present in the transformed Agrobacterium host. The armed plasmid can give a 
mixture of normal plant cells and gall. 

In some instances where Agrobacterium is used as the vehicle for transforming host 
plant cells, the expression or transcription construct bordered by the T-DNA border region(s) 
will be inserted into a broad host range vector capable of replication in E. coli and 
Agrobacterium, there being broad host range vectors described in the literature. Commonly 
used is pRK2 or derivatives thereof. See, for example, Ditta, et al. (Proc. Nat. Acad. Sci., 
U.S.A. (1980) 77:7347-7351) and EPA 0 120 515, which are incorporated herein by reference. 
Alternatively, one may insert the sequences to be expressed in plant cells into a vector 
containing separate replication sequences, one of which stabilizes the vector in E. coli, and 
the other in Agrobacterium. See, for example, McBride and Summerfelt (Plant Mol. Biol. 
(1990) 14:269-276), wherein the pRiHRI (Jouanin, et al., Mol. Gen. Genet. (1985) 201:370-374) 
origin of replication is utilized and provides for added stability of the plant expression 
vectors in host Agrobacterium cells. 



WO 00/18889 PCT/US99/22231 

Included with the expression construct and the T-DNA will be one or more markers, 
which allow for selection of transformed Agrobacterium and transformed plant cells. A 
number of markers have been developed for use with plant cells, such as resistance to 
chloramphenicol, kanamycin, the aminoglycoside G418, hygromycin, or the like. The 
particular marker employed is not essential to this invention, one or another marker being 
preferred depending on the particular host and the manner of construction. 

For transformation of plant cells using Agrobacterium, explants may be combined and 
incubated with the transformed Agrobacterium for sufficient time for transformation, the 
bacteria killed, and the plant cells cultured in an appropriate selective medium. Once callus 
forms, shoot formation can be encouraged by employing the appropriate plant hormones in 
accordance with known methods and the shoots transferred to rooting medium for 
regeneration of plants. The plants may then be grown to seed and the seed used to establish 
repetitive generations and for isolation of vegetable oils. 

There are several possible ways to obtain the plant cells of this invention which 
contain multiple expression constructs. Any means for producing a plant comprising a 
construct having a nucleic acid sequence of the present invention, and at least one other 
construct having another DNA sequence encoding an enzyme, are encompassed by the present 
invention. For example, the expression construct can be used to transform a plant at the same 
time as the second construct either by inclusion of both expression constructs in a single 
transformation vector or by using separate vectors, each of which expresses desired genes. The 
second construct can be introduced into a plant which has already been transformed with the 
first expression construct, or alternatively, transformed plants, one having the first construct 
and one having the second construct, can be crossed to bring the constructs together in the 
same plant. 

In general, acyltransferase proteins are active in the transfer of acyl groups from a 
donor to a variety of different substrates. For example, diacylglycerol acyltransferases add 
acyl groups to diacylglycerol to form triacylglycerol (TAG), or acyl-CoA:cholesterol 
acyltransferase uses an acyl-CoA as a donor to transfer an acyl group to a sterol to form a 
sterol ester. Typically, the substrates include, but are not limited to, glycerides, including 
mono and diglycerides, sterols, stanols, phosphatides, and the like. Donors include, but are 
not limited to, acyl-CoA and acyl-ACP molecules. 

The invention now being generally described, it will be more readily understood by 
reference to the following examples which are included for purposes of illustration only and 
are not intended to limit the present invention. 

EXAMPLES 

Example 1: RNA Isolations 

Total RNA from the inflorescence and developing seeds of Arabidopsis thaliana is 
isolated for use in construction of complementary DNA (cDNA) libraries. The procedure is an 
adaptation of the DNA isolation protocol of Webb and Knapp (D.M. Webb and S.J. Knapp, 
(1990) Plant Molec. Reporter, 8, 180-185). The following description assumes the use of 1g 
fresh weight of tissue. Frozen seed tissue is powdered by grinding under liquid nitrogen. The 
powder is added to 10ml REC buffer (50mM Tris-HCl, pH 9, 0.8M NaCl, 10mM EDTA, 
0.5% w/v CTAB (cetyltrimethyl-ammonium bromide)) along with 0.2g insoluble 
polyvinylpolypyrrolidone, and ground at room temperature. The homogenate is centrifuged 
for 5 minutes at 12,000 xg to pellet insoluble material. The resulting supernatant fraction is 
extracted with chloroform, and the top phase is recovered. 

The RNA is then precipitated by addition of 1 volume RecP (50mM Tris-HCl pH 9, 
10mM EDTA and 0.5% (w/v) CTAB) and collected by brief centrifugation as before. The 
RNA pellet is redissolved in 0.4 ml of 1M NaCl. The RNA pellet is redissolved in water and 
extracted with phenol/chloroform. Sufficient 3M potassium acetate (pH 5) is added to make 
the mixture 0.3M in acetate, followed by addition of two volumes of ethanol to precipitate the 
RNA. After washing with ethanol, this final RNA precipitate is dissolved in water and stored 
frozen. 

Alternatively, total RNA may be obtained using TRIzol reagent (BRL-Lifetechnologies, 
Gaithersburg, MD) following the manufacturer's protocol. The RNA 
precipitate is dissolved in water and stored frozen. 



Example 2: Identification of Acyltransferase Homology Sequences 

Searches are performed on a Silicon Graphics Unix computer using additional 
Bioaccellerator hardware and GenWeb software supplied by Compugen Ltd. This software 
and hardware enables the use of the Smith-Waterman algorithm in searching DNA and 
protein databases using profiles as queries. The program used to query protein databases is 
profilesearch. This is a search where the query is not a single sequence but a profile based on 
a multiple alignment of amino acid or nucleic acid sequences. The profile is used to query a 
sequence data set, i.e., a sequence database. The profile contains all the pertinent information 
for scoring each position in a sequence, in effect replacing the "scoring matrix" used for the 
standard query searches. The program used to query nucleotide databases with a protein 
profile is tprofilesearch. Tprofilesearch searches nucleic acid databases using an amino acid 
profile query. As the search is running, sequences in the database are translated to amino acid 
sequences in six reading frames. The output file for tprofilesearch is identical to the output 
file for profilesearch except for an additional column that indicates the frame in which the 
best alignment occurred. 
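
The six-frame translation step described above may be sketched as follows. This is a minimal illustration using the standard genetic code and is not the tprofilesearch implementation; the function names and the frame numbering (+1 to +3 forward, -1 to -3 on the reverse complement) are assumptions of the sketch.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return {frame: protein} for frames +1, +2, +3, -1, -2 and -3."""
    frames = {}
    rc = reverse_complement(dna)
    for offset in range(3):
        frames[offset + 1] = translate(dna[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


# Hypothetical toy sequence, for illustration only.
frames = six_frame_translations("ATGGCCTAA")
```

A search of the kind described would score the profile against each of the six translations and report the frame giving the best alignment.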

The Smith-Waterman algorithm (Smith and Waterman (1981) supra) is used to 
search for similarities between one sequence from the query and a group of sequences 
contained in the database. E score values as well as other sequence information, such as 
conserved peptide sequences of HXXXXD and PEG, are used to identify related sequences. 
By using the conserved peptide sequence information, E score values of greater than E-12 and 
E-8 are considered. For example, the EST sequence originally used to identify ATAT2 had 
an E score of 0.0094, while the EST sequence originally used to identify ATLPAAT1 had an 
E score of 0.0868. 
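
As an illustration only, the two kinds of evidence described above, a local alignment score and the presence of the conserved HXXXXD and PEG peptide sequences, may be sketched as follows. The match, mismatch and gap values are placeholder assumptions, the scoring uses a simple linear gap penalty rather than a profile, and the function names are hypothetical; the sketch does not reproduce the Compugen software or its E score calculation.

```python
import re

# Conserved peptide sequences named in the text; X is any residue.
# Lookahead patterns are used so that overlapping occurrences are reported.
MOTIFS = {
    "HXXXXD": re.compile(r"(?=H.{4}D)"),
    "PEG": re.compile(r"(?=PEG)"),
}


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (Smith-Waterman, linear gaps)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def find_motifs(protein):
    """Return {motif name: list of 0-based start positions}."""
    protein = protein.upper()
    return {
        name: [m.start() for m in pattern.finditer(protein)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical toy sequence, for illustration only.
hits = find_motifs("MAHAAAADGGPEGK")
has_both_motifs = all(hits[name] for name in MOTIFS)
```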

A protein sequence of glycerol-3-phosphate acyltransferase from E. coli (Swiss Prot Accession 
P00482) is used to search the NCBI non-redundant protein database using BLAST. In the 
first round of searches, other membrane forms of G3PAAT are identified. In subsequent 
PSI-BLAST searches (Altschul, et al. (1997) Nucleic Acids Res. 25:3389-3402), LPAATs and 
other acyltransferases are identified. Using sequence alignment software programs, G3PAAT 
and different LPAAT amino acid sequences are aligned, and a profile is generated using a 
homologous sequence region, between amino acids 256 and 459 of the E. coli sequence. 
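
The region between amino acids 256 and 459, counted inclusively from 1, spans 459 - 256 + 1 = 204 residues. A minimal sketch of extracting such a region from a sequence string is given below; the function name `extract_region` and the placeholder sequence are hypothetical, and the actual E. coli sequence is not reproduced here.

```python
def extract_region(sequence, start, end):
    """Return residues start..end of a sequence, 1-based and inclusive."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("region must lie within the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# Placeholder sequence standing in for the E. coli protein (assumption).
placeholder = "A" * 500
region = extract_region(placeholder, 256, 459)
```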

The identified 204 amino acid region is used to query the protein database using PSI-BLAST. 
After 5 iterations of PSI-BLAST, the profile generated from this new query (Figure 1) 
identified soluble forms of G3PAAT. Prior to this identification, no sequence homology had 
been identified between the membrane and soluble forms of G3PAAT. 



Example 3: Excision of PSI-BLAST Profile 

The profile generated from the queries using PSI-BLAST is excised from the hypertext 
markup language (html) file. The World Wide Web (www)/html interface to psiblast at 
NCBI stores the current generated profile matrix in a hidden field in the html file that is 
returned after each iteration of psiblast. However, this matrix has been encoded into string62 
(s62) format for ease of transport through html. String62 format is a simple conversion of the 
values of the matrix into html legal ascii characters. 

The encoded matrix width (x axis) is 26 characters, comprising the consensus 
character, the probabilities of each amino acid in the order A,B,C,D,E,F,G,H,I,K,L,M,N, 
P,Q,R,S,T,V,W,X,Y,Z (where B represents D and N, Z represents Q and E, and X 
represents any amino acid), the gap creation value, and the gap extension value. 

The length (y axis) of the matrix corresponds to the length of the sequences identified 
by PSI-BLAST. The order of the amino acids corresponds to the conserved amino acid 
sequence of the sequences identified using PSI-BLAST, with the N-terminal end at the top of 
the matrix. The probabilities of other amino acids at that position are represented for each 
amino acid along the x axis, below the respective single letter amino acid abbreviation. 

Thus, each row of the profile consists of the highest scoring (consensus) amino acid, 
followed by the scores for each possible amino acid at that position in sequence matrix, the 
score for opening a gap at that position, and the score for continuing a gap at that position. 

The string62 file is converted back into a profile for use in subsequent searches. The 
gap open field is set to 11 and the gap extension field is set to 1 along the x axis. The gap 
creation and gap extension values are known, based on the settings given to the PSI-BLAST 
algorithm. The matrix is exported to the standard GCG profile form. This format can be read 
by GenWeb. 

The algorithm used to convert the string62 formatted file to the matrix is outlined in 
Table 1. 

Table 1 

1. if encoded character z then the value is blast score min 
2. if encoded character Z then the value is blast score max 
3. else if the encoded character is uppercase then its value is (64-(ascii # of char)) 
4. else if the encoded character is a digit the value is ((ascii # of char)-48) 
5. else if the encoded character is not uppercase then the value is ((ascii # of char) - 87) 
6. ALL B positions are set to min of D and N amino acids at that row in sequence matrix 
7. ALL Z positions are set to min of Q and E amino acids at that row in sequence matrix 
8. ALL X positions are set to min of all amino acids at that row in sequence matrix 
9. kBLAST_SCORE_MAX=999; 
10. kBLAST_SCORE_MIN=-999; 
11. all gap opens are set to 11 
12. all gap lens are set to 1 
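
The rules of Table 1 may be sketched in Python as follows, for illustration only. The text does not specify how encoded characters are laid out within the hidden html field, so the sketch assumes that the scores for the twenty standard amino acids of one row have already been decoded one character at a time; the function names and the toy input are hypothetical.

```python
BLAST_SCORE_MAX = 999    # Table 1, rule 9
BLAST_SCORE_MIN = -999   # Table 1, rule 10
GAP_OPEN = 11            # Table 1, rule 11
GAP_EXTEND = 1           # Table 1, rule 12

# Column order given in the text: 23 symbols including B, X and Z.
COLUMN_ORDER = "ABCDEFGHIKLMNPQRSTVWXYZ"
STANDARD = [aa for aa in COLUMN_ORDER if aa not in "BXZ"]


def decode_s62_char(ch):
    """Decode one string62 character into a score (Table 1, rules 1 to 5)."""
    if len(ch) != 1 or not (ch.isascii() and ch.isalnum()):
        raise ValueError("expected a single ASCII letter or digit")
    if ch == "z":
        return BLAST_SCORE_MIN
    if ch == "Z":
        return BLAST_SCORE_MAX
    if ch.isupper():
        return 64 - ord(ch)   # 'A' -> -1 ... 'Y' -> -25
    if ch.isdigit():
        return ord(ch) - 48   # '0' -> 0 ... '9' -> 9
    return ord(ch) - 87       # 'a' -> 10 ... 'y' -> 34


def build_profile_row(consensus, scores):
    """Assemble one 26-column row from the 20 standard amino acid scores."""
    full = dict(scores)
    full["B"] = min(scores["D"], scores["N"])       # rule 6
    full["Z"] = min(scores["Q"], scores["E"])       # rule 7
    full["X"] = min(scores[aa] for aa in STANDARD)  # rule 8
    return [consensus] + [full[aa] for aa in COLUMN_ORDER] + [GAP_OPEN, GAP_EXTEND]


# Hypothetical encoded scores for one row, for illustration only.
decoded = {aa: decode_s62_char(c) for aa, c in zip(STANDARD, "0123456789abcdefghij")}
row = build_profile_row("Y", decoded)
```

Each assembled row has 26 columns, consistent with the matrix width stated above: the consensus character, 23 scores, the gap creation value and the gap extension value.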





Example 4: Identification of Novel Acyltransferase Related Amino Acid Sequences 

The profile (Figure 1) is used in further queries to identify a number of previously 
unidentified proteins from yeast as novel acyltransferases. A protein is identified from an 
Arabidopsis protein sequence database (ATAT1) (SEQ ID NO:2). Sequences are also 
identified from nucleic acid databases (Table 2). 





Table 2 

Database ID Number    BLAST Search Hits            Log probability 

Saccharomyces cerevisiae 
gi 1078509            Limnanthes putative LPAAT    e-10 (SEQ ID NO:217) 
gi 586485             Limnanthes putative LPAAT    e-13 (SEQ ID NO:218) 
gi 320748                                          (SEQ ID NO:219) 
gi 2506920 
gi 549627                                          (SEQ ID NO:221) 
gi 213303]                                         (SEQ ID NO:222) 
gi 2132939                                         (SEQ ID NO:223) 
gi 2132299                                         (SEQ ID NO:224) 
                      similar totrin-              e-118 (SEQ ID 
                      unidentified 
                      unidentified 

To obtain the entire coding region corresponding to the Arabidopsis acyltransferase 
sequences, synthetic oligonucleotide primers are designed to [...] partial cDNA [...] 
corresponding to the [...] sequences (Table 3) and used 
in Rapid Amplification of cDNA Ends (RACE) reactions (Frohman et al. (1988) Proc. Natl. 
Acad. Sci. USA 85:8998-9002) using the Marathon cDNA amplification kit (Clontech 
Laboratories Inc, Palo Alto, CA). Primers with an R designation are used for 5' RACE 
reactions, and primers with an F designation are used for 3' RACE reactions. 

Table 3 

ATAT2 
ATAT2R1       CCATCCGCTTCAAGGGAACGACACCCATCA (SEQ ID NO:135) 
ATAT2R2       TCCCTGTCTTGCTTGATGAACTTAAAGCTTG (SEQ ID NO:136) 
ATAT2R3       ACAGCAGGAGTGTCTGATGATGGCAGATTC (SEQ ID NO:137) 

ATAT3 
ATAT3R1       ACTGGAGTTCCAGCCAAAAATGCACCTGTC (SEQ ID NO:138) 
ATAT3R2       GATACACCCTTGAAATCAGGCGATTTTGCT (SEQ ID NO:139) 

ATAT4 
ATAT4R1       TTGCAAATTCAATTCCTGTTTCACCGGGCC (SEQ ID NO:140) 
ATAT4R2       GTTTTCTGCTATTCCAGAAGGCGTCAACAA (SEQ ID NO:141) 

ATAT5 
ATAT5R1       CATTGAAGATCCGTCCGTGAAGTTNCCTTACC (SEQ ID NO:142) 
ATAT5R2       TCGAGCTGTGATCGATGATTGGCTGTGAAG (SEQ ID NO:143) 
ATAT5F1       GTCTCTTCAAAAACACACACACACGTCTCT (SEQ ID NO:144) 
ATAT5F2       GTCTCTTCAAAAACACACACACACGTCTCT (SEQ ID NO:145) 

ATAT6 
H76348-F1     GTAGAGAGCCTTACTTGCTTCGGTTTAGTC (SEQ ID NO:146) 
H76348-F2     ACGTCATCGTACCTGTTGCTATTGACTCAC (SEQ ID NO:147) 
H76348-R1     ACTTTTCCATTGTCAGGGACTCCTCGACAC (SEQ ID NO:148) 
H76348-R2     ACGGTGTAGGAAGGGAAAGGATTCAAAAGG (SEQ ID NO:149) 

ATAT7 
ATTS0193-F1   GCGATGAACTACAGAGTCGGATTCTTCCTC (SEQ ID NO:150) 
ATTS0193-F2   CCGGTTTACGAGATTACGTTCTTGAACCAG (SEQ ID NO:151) 
ATTS0193-R1   CAATGGAGACAAGGCTCGAAAGTGCTAACC (SEQ ID NO:152) 
ATTS0193-R2   ATTCTCTGAACATAGTTCGCCACGGTCATG (SEQ ID NO:153) 

ATAT8 
AA042618-F1   GAAATCCAACGCCTTCCCAATATCACTCTG (SEQ ID NO:154) 
AA042618-F2   CTTCAACTTTCCATCAGGATCTTGGCACGT (SEQ ID NO:155) 
AA042618-R1   ACCACTTGTTAGAGAGCTTACCTGCTTAGG (SEQ ID NO:156) 
AA042618-R2   TCCTACCTACACCATCCAATTTCTCGACCC (SEQ ID NO:157) 

ATAT11 
ATAT11R1      CTGCGTCAAGTGAGCAACTCAGTTCTTGCA (SEQ ID NO:158) 
ATAT11R2      TGGGAAGCAGCACGTTGTTCAGTATCGGAA (SEQ ID NO:159) 
ATAT11R3      TAGCCTCTGTGTAATCTGTGCCCTCGGGGA (SEQ ID NO:160) 



From the nucleic acid sequences obtained from the RACE reactions, protein sequence
is predicted for each nucleic acid sequence using MacVector software. Nucleic acid sequences
are provided for ATAT1 (SEQ ID NO:1), ATAT2 (SEQ ID NO:3), ATAT3 (SEQ ID NO:5),
ATAT4 (SEQ ID NO:7), ATAT5 (SEQ ID NO:9), ATAT6 (SEQ ID NO:10), ATAT7 (SEQ
ID NO:12), ATAT8 (SEQ ID NO:14), ATAT9 (SEQ ID NO:16), ATAT10 (SEQ ID NO:18),
ATAT11 (SEQ ID NO:20) and ATLPAAT1 (SEQ ID NO:22), respectively.
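
The prediction of a protein sequence from a nucleic acid sequence can be illustrated with a minimal sketch. It is not the MacVector procedure itself; it simply applies the standard genetic code from the first ATG to the first in-frame stop codon. The function name `translate` and the one-letter output are choices made for this illustration, and the input is the first 60 nucleotides of SEQ ID NO:1 as given in the Sequence Listing.

```python
# Standard genetic code, indexed by codon position in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from the first ATG to the first in-frame stop codon."""
    seq = dna.upper().replace(" ", "")
    start = seq.find("ATG")
    if start < 0:
        return ""
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        # Codons containing ambiguous bases (e.g. N) are reported as X.
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 60 nucleotides of SEQ ID NO:1 (ATAT1).
atat1_start = "atggttcatg cgaccaagtc agccacaacg attccaaaag aacgcttaaa gaaccgcata"
print(translate(atat1_start))  # MVHATKSATTIPKERLKNRI
```

The printed residues agree with the first twenty amino acids of SEQ ID NO:2 (Met Val His Ala Thr Lys Ser Ala Thr Thr Ile Pro Lys Glu Arg Leu Lys Asn Arg Ile).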

The protein sequence derived from the ATAT1 (SEQ ID NO:2) nucleic acid sequence
from Arabidopsis has a predicted molecular mass of 32.5 kDa, and a pI of 9.74. Alignment
of the Arabidopsis acyltransferase with several LPAAT and G3PAAT shows that some of the
domains that are conserved between LPAAT and G3PAAT are conserved in the new
acyltransferase protein.

The ATAT2 nucleic acid sequence is predicted to encode a 312 amino acid protein
(SEQ ID NO:4), with a molecular weight of 34.6 kD, and a pI of 9.99. The ATAT2 protein
may also contain 2 to 3 transmembrane domains. However, the protein encoded by the
ATAT2 nucleic acid sequence may be longer than predicted because of the absence of an
in-frame stop codon upstream of the ATG start codon used.

The ATAT3 nucleic acid sequence is predicted to encode a 398 amino acid protein
(SEQ ID NO:6), with a molecular weight of 44.7 kD, and a pI of 5.62. The ATAT3 protein
may contain 1 to 4 transmembrane domains. The ATAT4 nucleic acid sequence is predicted
to encode a 317 amino acid protein (SEQ ID NO:8), with a molecular weight of 36.5 kD, and
a pI of 9.67. The ATAT4 protein is predicted to have 2 to 5 transmembrane domains.
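
A predicted molecular weight of the kind reported above can be approximated from the amino acid sequence alone. The sketch below is an assumption about how such a figure may be obtained, not the method used to produce the values in this specification: it sums standard average residue masses and adds one water molecule for the free termini. The function name `molecular_mass_kda` is hypothetical.

```python
# Average residue masses (Da) of the twenty standard amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added for the free N- and C-termini


def molecular_mass_kda(protein: str) -> float:
    """Approximate mass of an unmodified polypeptide in kDa (one-letter codes)."""
    total = sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER
    return total / 1000.0


# The first twenty residues of SEQ ID NO:2, used here only as sample input.
print(round(molecular_mass_kda("MVHATKSATTIPKERLKNRI"), 3))
```

Values computed this way ignore signal peptide cleavage and post-translational modification, so they may differ slightly from figures produced by other software.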
The ATLPAAT1 nucleic acid sequence is predicted to encode a 389 amino acid
protein (SEQ ID NO:23), with a molecular weight of 43.7 kD, and a pI of 9.52. The
ATLPAAT1 protein is predicted to have up to 3 transmembrane domains. The protein
predicted from the ATLPAAT1 nucleic acid sequence is similar to LPAATs reported for
Brassica, maize, and meadowfoam (described in PCT Publication WO 94/13814). The
ATAT11 nucleic acid sequence is predicted to encode a 375 amino acid protein (SEQ ID
NO:21), with a molecular weight of 43.5 kD, and a pI of 9.45. The deduced amino acid
sequences of ATAT6 (SEQ ID NO:11), ATAT7 (SEQ ID NO:13), ATAT8 (SEQ ID NO:15),
ATAT9 (SEQ ID NO:17), and ATAT10 (SEQ ID NO:19) are also provided.

A sequence region approximately 30 amino acids upstream through approximately
100 amino acids downstream of the conserved amino acid sequences HXXXXD (Heath and
Rock (1998) J. Bacteriol. 180(6):1425-1430) and PEG (Neuwald (1997) Curr. Biol. 7:R465-R466)
of the predicted amino acid sequences derived from the nucleic acid sequences of
ATAT1, ATAT2, ATAT3, ATAT4, ATAT6, ATAT7, ATAT8, ATAT9, ATAT10,
ATLPAAT1, and ATAT11 is compared to the amino acid sequences of lysophosphatidic
acid acyltransferase (Jojoba AT (SEQ ID NO:162; the nucleic acid sequence is provided in
SEQ ID NO:161), maize AT (PCT Publication WO 94/13814), PLSC coco (GenBank
accession 1098605), PLSC Lim (GenBank accession 1209507), PLSC Ecoli (GenBank
accession 1209507), and PLSC Yeast (GenBank accession 464422)) and glycerol-3-phosphate
acyltransferase (PLSB Ecoli (GenBank accession 130326) and PLSB Mouse (GenBank
accession 2498786)) (Figure 2), and similarities are identified (Figure 2 and Figure 3).
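
The region used for the comparison can be located programmatically. The following is a minimal sketch under one stated assumption: the window is taken to run from 30 residues upstream of the first HXXXXD match to 100 residues downstream of the first PEG that follows it. The function name `motif_window` and the toy input sequence are hypothetical and serve only to show the coordinates involved.

```python
import re


def motif_window(protein, upstream=30, downstream=100):
    """Return the region spanning HXXXXD through PEG with flanks, or None."""
    h_box = re.search(r"H.{4}D", protein)  # H, any four residues, D
    if h_box is None:
        return None
    peg = protein.find("PEG", h_box.end())  # first PEG after the H box
    if peg < 0:
        return None
    start = max(0, h_box.start() - upstream)
    end = min(len(protein), peg + 3 + downstream)
    return protein[start:end]


# Toy sequence: 40 filler residues, an H box, a spacer, PEG, and a long tail.
toy = "A" * 40 + "HAAAAD" + "G" * 20 + "PEG" + "K" * 120
window = motif_window(toy)
print(len(window))  # 30 + 6 + 20 + 3 + 100 = 159
```

Sequences lacking either motif return None and would be excluded from an alignment built on this window.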

Sequence comparisons reveal that several classes of acyltransferases exist, based on
conserved amino acid sequences identified in the comparisons in Figure 2. For example,
ATAT1, ATAT6, ATAT7, ATAT8, and ATAT9 contain the conserved amino acid
sequences VTYSXS (SEQ ID NO:128), VXLTRXR (SEQ ID NO:129), and LXXGDLV (SEQ
ID NO:132) between the HXXXXD and PEG sequences. In addition, ATAT1, ATAT6,
ATAT7, ATAT8, and ATAT9 also contain the conserved sequence CPEGT (SEQ ID
NO:130), which comprises the PEG sequence, as well as IVPVA (SEQ ID NO:131) and
VANXXQ (SEQ ID NO:134) (Figure 2) downstream of the PEG sequence. The sequences
corresponding to ATAT1, ATAT7, and ATAT9 are the most closely related in this class, with
similarities between ATAT1 and ATAT9 of 67.0%, between ATAT1 and ATAT7 of 58.2%,
and between ATAT9 and ATAT7 of 63.9% (Figure 3B).
Sequence comparisons also demonstrate that the sequence of ATLPAAT1 is most
closely related to the jojoba LPAAT (82.3% similar) and the maize LPAAT (78.0% similar).

Furthermore, sequence analysis demonstrates that ATAT4 is the most divergent
sequence, with the highest similarity to ATAT10 (18.5%). The highest similarity (15.3%) to a
known sequence is with a meadowfoam (Limnanthes douglasii) LPAAT. However, the
sequences of ATAT4 and ATAT10 share several conserved peptide sequences with the amino
acid sequences of ATAT2 and ATAT3 (Figure 2): VXNHXS (SEQ ID NO:127), where the H
comprises the conserved H of the HXXXXD sequence, and FXXGAF (SEQ ID NO:133)
downstream of the PEG sequence.


Example 6: Identification of Additional Acyltransferase Sequences 

The novel Arabidopsis sequences identified above are used to search proprietary
databases containing soybean and corn EST sequences. This search identifies
EST sequences from soybean (SEQ ID NO:24 through SEQ ID NO:85) as well as from corn
(SEQ ID NO:86 through SEQ ID NO:126) as encoding acyltransferase related proteins.

Sequence comparisons between the various EST sequences and the complete
Arabidopsis sequences reveal that the identified EST sequences demonstrate high
similarity to the various Arabidopsis sequences, as determined by BLAST scores.

Expressed Sequence Tag (EST) sequences from soybean and corn databases are
identified which are most closely related by BLAST score to ATAT1 (SEQ ID NOS:24-29
and SEQ ID NOS:86-88, respectively), ATAT2 (SEQ ID NO:30 and SEQ ID NO:89,
respectively), ATAT3 (SEQ ID NOS:31-35 and SEQ ID NOS:90-94, respectively), ATAT4
(SEQ ID NOS:36-44 and SEQ ID NOS:95-100, respectively), ATAT6 (SEQ ID NOS:45-49
and SEQ ID NO:101, respectively), ATAT7 (SEQ ID NOS:50-54 and SEQ ID NOS:102-103,
respectively), ATAT8 (SEQ ID NOS:55-56 and SEQ ID NO:104, respectively), ATAT9
(SEQ ID NOS:57-79 and SEQ ID NOS:105-111, respectively), ATAT10 (SEQ ID NOS:80-81
and SEQ ID NO:112, respectively), ATAT11 (SEQ ID NOS:82-85 and SEQ ID
NOS:123-126, respectively), and ATLPAAT1 (SEQ ID NOS:113-122).
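
The grouping of ESTs by their most closely related Arabidopsis sequence amounts to keeping, for each EST, the query with the highest BLAST score. The sketch below illustrates that bookkeeping only; the EST identifiers and scores are hypothetical, and the function name `assign_by_best_score` is not taken from any BLAST software.

```python
def assign_by_best_score(hits):
    """Map each EST to the query sequence giving its highest BLAST score.

    hits: iterable of (est_id, query_name, score) tuples.
    """
    best = {}
    for est_id, query_name, score in hits:
        # Keep only the highest-scoring query seen so far for this EST.
        if est_id not in best or score > best[est_id][1]:
            best[est_id] = (query_name, score)
    return {est_id: query for est_id, (query, _score) in best.items()}


# Hypothetical EST names and scores, for illustration only.
example_hits = [
    ("soy_est_a", "ATAT1", 210.0),
    ("soy_est_a", "ATAT9", 185.0),
    ("corn_est_b", "ATAT4", 95.0),
    ("corn_est_b", "ATAT10", 120.0),
]
assignments = assign_by_best_score(example_hits)
print(assignments)
```

With these sample values, the first EST is grouped with ATAT1 and the second with ATAT10.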
Example 7: Expression Construct Preparation

A series of synthetic oligonucleotide primers were prepared for use in Polymerase
Chain Reactions (PCR) to amplify the entire DNA sequences encoding the various
acyltransferase sequences identified above. The sequences are listed in Table 3.



Table 3

Primer        Sequence (listed 5'-3')                                       SEQ ID NO:

ATAT1F        AAGCTTGCATGCGTCGACACAATGGTTCATGCGACCAAGTCAG                   163
ATAT1R        GGTACCGTCGACTCACTTCTTGGTGTTGTTGATAG                           164
ATAT2F        GGATCCGCGGCCGCACAATGACGAGCTTTACTACTTCCCTTCAT                  165
ATAT2R        GGATCCCCTGCAGGTTAGAGATCCATTGATTCTGCAAT                        166
ATAT3F        GGATCCGCGGCCGCATAATGGAATCAGAGCTCAAAGAT                        167
ATAT3R        GGATCCCCTGCAGGTCATTCTTCTTTCTGATGGAAATC                        168
ATAT4F        GGATCCGCGGCCGCACAATGACTCGTTCACAAGATGTTTCA                     169
ATAT4R        GGATCCCCTGCAGGTCACTTCTCTTCCAATCTAGCCAG                        170
ATAT6F        GGATCCGCGGCCGCACAATGTCCGGTAATAAGATCTCGACTCTTCA                171
ATAT6R        GGATCCCCTGCAGGTTATTTTTTCTTGACAACTCCGTTATTACCGG                172
ATAT7F        ATATCCGCGGCCGCACAATGGTTATGGAGCAAGCTGGAA                       173
ATAT7R        GGATCCCCTGCAGGTCAATGGAGACAAGGCTCGAAAGT                        174
ATAT8F        GGATCCGCGGCCGCACAATGTCCGCCAAGATTTCAATATTCC                    175
ATAT8R        GGATCCCCTGCAGGTTAATTTTTCTTAACTACTCCATT                        176
ATAT9F        GGATCCGCGGCCGCACAATGGGAGCTCAGGAGAAACGGCGCC                    177
ATAT9R        GGATCCCCTGCAGGTCACGTCTTCTCCTTCTTCACCGG                        178
ATAT10F       GGATCCGCGGCCGCACAATGGCGGATCCTGATCTGTCTTCTCCT                  179
ATAT10R       GGATCCCCTGCAGGTTATGTTGGGGCCAAGTCAGGTGCAAAGAT                  180
ATAT11F       GGATCCGCGGCCGCAAAATGGAAAAAAAGAGTGTACCAAATTCT                  181
ATAT11R       GGATCCCCTGCAGGTTATTTGTTTACTAATTTGAGGGAATTTTTTG                182
ATLPAAT1F     TCGACCTGCAGGAAGCTTAAGGATGGTGATTGCTGC                          183
ATLPAAT1R     GGATCCGCGGCCGCTTACTTCTCCTTCTCCG                               184
YSCAT1F       GGATCCGCGGCCGCACAATGTCTTTTAGGGATGTCCTAG                       185
YSCAT1R       GGATCCCCTGCAGGTCAATCATCCTTACCCTTTGGTTTACC                     186
YSCAT1 KO F   ATGTCTTTTAGGGATGTCCTAGAAAGAGGAGATGAATTTTCTGTGCGGTATTTCACACCG  187
YSCAT1 KO R   TCAATCATCCTTACCCTTTGGTTTACCCTCTGGAGGCAGAAGATTGTACTGAGAGTGCAC  188
YSCAT2F       GGATCCGCGGCCGCACAATGAAGCATTCCCAAAAATACCGTAGG                  189
YSCAT2R       GGATCCCCTGCAGGTCAATGATTTTTTTTCATCACAAATAC                     190
YSCAT2 KO F   ATGAAGCATTCCCAAAAATACCGTAGGTATGGAATTTATGCTGTGCGGTATTTCACACCG  191
YSCAT2 KO R   TCAATGATTTTTTTTCATCACAAATACAAGAATAAGAAAAAGATTGTACTGAGAGTGCAC  192
YSCAT3F       GGATCCGCGGCCGCACAATGGGTTTTGTTGATTTCTTCGAAAC                   193
YSCAT3R       GGATCCCCTGCAGGTTATTTGGTCTCAATTTTAATATTTTTTTGC                 194
YSCAT3 KO F   ATGGGTTTTGTTGATTTCTTCGAAACATATATGGTCGGTTCTGTGCGGTATTTCACACCG  195
YSCAT3 KO R   TTATTTGGTCTCAATTTTAATATTTTTTTGCAAGGACTCGAGATTGTACTGAGAGTGCAC  196
YSCAT4F       GGATCCGCGGCCGCACAATGGAAAAGTACACCAATTGGAGAGAC                  197
YSCAT4R       GGATCCCCTGCAGGCTACTTCCTCTTTTTACGTTGATCGCTG                    198
YSCAT4 KO F   [illegible]CTGTGCGGTATTTCACACCG                               199
YSCAT4 KO R   CTACTTCCTCTTTTTACGTTGATCGCTGATATATTCCTTCAGATTGTACTGAGAGTGCAC  200
YSCAT5F       GGATCCGCGGCCGCACAATGCCTGCACCAAAACTCACGGAG                     201
YSCAT5R       GGATCCCCTGCAGGCTACGCATCTCCTTCTTTCCCTTC                        202
YSCAT5 KO F   ATGCCTGCACCAAAACTCACGGAGAAATCTGCCTCTTCCACTGTGCGGTATTTCACACCG  203
YSCAT5 KO R   CTACGCATCTCCTTCTTTCCCTTCTTCTTCTTCTTCCTCTAGATTGTACTGAGAGTGCAC  204
YSCAT6F       GGATCCGCGGCCGCACAATGTCTGCTCCCGCTGCCGATCATAACGC                205
YSCAT6R       GGATCCCCTGCAGGTCATTCTTTCTTTTCGTGTTCTCTTTTCTG                  206
YSCAT6 KO F   ATGTCTGCTCCCGCTGCCGATCATAACGCTGCCAAACCTACTGTGCGGTATTTCACACCG  207
YSCAT6 KO R   TCATTCTTTCTTTTCGTGTTCTCTTTTCTG[illegible]AGATTGTACTGAGAGTGCAC 208
YSCAT7F       GGATCCGCGGCCGCACAATG[illegible]                               209
YSCAT7R       GGATCCCCTGCAGG[illegible]                                     210
YSCAT7 KO F   [illegible]CTGTGCGGTATTTCACACCG                               211
YSCAT7 KO R   [illegible]AGATTGTACTGAGAGTGCAC                               212
YSCAT8F       GGATCCGCGGCCGCACAATGAGTGTGATAGGTAGGTTCTTG                     213
YSCAT8R       GGATCCCCTGCAGGTTAATGCATCTTTTTTACAGATGAACC                     214
YSCAT8 KO F   ATGAGTGTGATAGGTAGGTTCTTGTATTACTTGAGGTCCGCTGTGCGGTATTTCACACCG  215
YSCAT8 KO R   TTAATGCATCTTTTTTACAGATGAACCTTCGTTATGGGTAAGATTGTACTGAGAGTGCAC  216


The entire coding regions for each of the acyltransferase sequences were amplified
using the respective primers listed in Table 3 above, cloned into the vector pCR2.1Topo
(Invitrogen) or pZero (Invitrogen), and labeled as pCGN8558 (ATAT1), pCGN8564
(ATAT2), pCGN8565 (ATAT3), pCGN8566 (ATAT4), pCGN8918 (ATAT6),
pCGN8913 (ATAT7), pCGN8904 (ATAT8), pCGN9970 (ATAT9), pCGN9940
(ATAT10), pCGN8567 (ATAT11), pCGN8632 (ATLPAAT1), pCGN9901 (YSCAT1,
also referred to as gi2132299), pCGN9902 (YSCAT2, also referred to as gi1078509),
pCGN9903 (YSCAT3, also referred to as gi2132939), pCGN9904 (YSCAT4, also
referred to as gi2133031), pCGN9905 (YSCAT5, also referred to as gi320748), pCGN9906
(YSCAT6, also referred to as gi549627), pCGN9907 (YSCAT7, also referred to as
gi586485), and pCGN9908 (YSCAT8, also referred to as gi464422). The nucleic acid
sequences for the respective yeast acyltransferases are provided as YSCAT1 (SEQ ID
NO:225), YSCAT2 (SEQ ID NO:226), YSCAT3 (SEQ ID NO:227), YSCAT4 (SEQ ID
NO:228), YSCAT5 (SEQ ID NO:229), YSCAT6 (SEQ ID NO:230), YSCAT7 (SEQ ID
NO:231), and YSCAT8 (SEQ ID NO:232).
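
Most of the amplification primers listed above begin with a short 5' extension carrying restriction enzyme recognition sequences. A minimal sketch for checking which sites a primer carries is given below. The recognition sequences are the standard ones for the named enzymes; the helper name `sites_in` and the choice of enzymes to test are assumptions made for this illustration, and the two primers are ATAT2F and ATAT2R from the table.

```python
# Recognition sequences of restriction enzymes referred to in this example.
SITES = {
    "BamHI": "GGATCC",
    "NotI": "GCGGCCGC",
    "Sse8387I": "CCTGCAGG",
    "SalI": "GTCGAC",
    "HindIII": "AAGCTT",
}


def sites_in(primer):
    """Return the names of the enzymes whose recognition site occurs in primer."""
    seq = primer.upper().replace(" ", "")
    return [name for name, site in SITES.items() if site in seq]


ATAT2F = "GGATCCGCGGCCGCACAATGACGAGCTTTACTACTTCCCTTCAT"  # SEQ ID NO:165
ATAT2R = "GGATCCCCTGCAGGTTAGAGATCCATTGATTCTGCAAT"        # SEQ ID NO:166

print(sites_in(ATAT2F))  # ['BamHI', 'NotI']
print(sites_in(ATAT2R))  # ['BamHI', 'Sse8387I']
```

A forward primer carrying NotI and a reverse primer carrying Sse8387I would allow the amplified coding region to be excised as a NotI/Sse8387I fragment.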
7A. Baculovirus Expression Constructs

Constructs are prepared to direct the expression of the Arabidopsis ATAT sequences
in cultured insect cells. The entire coding regions of ATAT1, 2, 3, 4, 6, 7, 8, 9, 10, and 11 are
cloned into the vector pFastBac1 (Gibco-BRL, Gaithersburg, MD) digested with NotI and
PstI. The respective coding sequences were cloned as NotI/Sse8387I fragments. Double
stranded DNA sequence was obtained to verify that no errors were introduced by PCR
amplification. The resulting plasmids were designated pCGN9723 (ATAT1), pCGN9724
(ATAT2), pCGN9725 (ATAT3), pCGN9726 (ATAT4), pCGN9727 (ATAT5), pCGN9728
(ATAT7), pCGN9729 (ATAT8), pCGN9991 (ATAT9), pCGN9730 (ATAT10), and pCGN9731
(ATAT11).

7B. Plant Expression Construct Preparation 

A plasmid containing the napin cassette derived from pCGN3223 (described in USPN
5,639,790, the entirety of which is incorporated herein by reference) was modified to make it
more useful for cloning large DNA fragments containing multiple restriction sites, and to
allow the cloning of multiple napin fusion genes into plant binary transformation vectors. An
adapter comprised of the self annealed oligonucleotide of sequence
CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT
(SEQ ID NO:233) was ligated into the cloning vector pBC SK+ (Stratagene) after
digestion with the restriction endonuclease BssHII to construct vector pCGN7765. Plasmids
pCGN3223 and pCGN7765 were digested with NotI and ligated together. The resultant
vector, pCGN7770, contains the pCGN7765 backbone with the napin seed specific
expression cassette from pCGN3223.

The cloning cassette pCGN7787 contains essentially the same regulatory elements as
pCGN7770, with the exception that the napin regulatory regions of pCGN7770 have been
replaced with the double CaMV 35S promoter and the tml polyadenylation and
transcriptional termination region.

A binary vector for plant transformation, pCGN5139, was constructed from
pCGN1558 (McBride and Summerfelt (1990) Plant Molecular Biology 14:269-276). The
polylinker of pCGN1558 was replaced as a HindIII/Asp718 fragment with a polylinker
containing unique restriction endonuclease sites, AscI, PacI, XbaI, SwaI, BamHI, and NotI.
The Asp718 and HindIII restriction endonuclease sites are retained in pCGN5139.

A series of turbo binary vectors are constructed to allow for the rapid cloning of DNA
sequences into binary vectors containing transcriptional initiation regions (promoters) and
transcriptional termination regions.

The plasmid pCGN8618 was constructed by ligating oligonucleotides
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:234) and
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:235) into SalI/XhoI-digested
pCGN7770. A fragment containing the napin promoter, polylinker and napin 3'
region was excised from pCGN8618 by digestion with Asp718I; the fragment was
blunt-ended by filling in the 5' overhangs with Klenow fragment, then ligated into pCGN5139 that
had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs
with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter
was closest to the blunted Asp718I site of pCGN5139 and the napin 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated pCGN8622.
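
That the two oligonucleotides of SEQ ID NO:234 and SEQ ID NO:235 form a duplex suitable for ligation can be checked with a short sketch. It assumes, as a simplification, that each strand leaves its first four bases unpaired as a 5' overhang; the helper names `reverse_complement` and `anneals` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def anneals(top, bottom, overhang=4):
    """True if the strands pair fully once each leaves `overhang` 5' bases unpaired."""
    return top[overhang:] == reverse_complement(bottom[overhang:])


OLIGO_234 = "TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG"  # SEQ ID NO:234
OLIGO_235 = "TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC"  # SEQ ID NO:235

print(anneals(OLIGO_234, OLIGO_235))     # True
print(OLIGO_234[:4], OLIGO_235[:4])      # TCGA TCGA
```

Both strands expose a 5'-TCGA overhang, which is the overhang left by SalI and by XhoI, so the annealed pair can be ligated into SalI/XhoI-digested vector in either orientation.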

The plasmid pCGN8619 was constructed by ligating oligonucleotides
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:236) and
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:237) into SalI/XhoI-digested
pCGN7770. A fragment containing the napin promoter, polylinker and napin 3'
region was removed from pCGN8619 by digestion with Asp718I; the fragment was
blunt-ended by filling in the 5' overhangs with Klenow fragment, then ligated into pCGN5139 that
had been digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs
with Klenow fragment. A plasmid containing the insert oriented so that the napin promoter
was closest to the blunted Asp718I site of pCGN5139 and the napin 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated pCGN8623.

The plasmid pCGN8620 was constructed by ligating oligonucleotides
5'-TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT-3' (SEQ ID NO:238) and
5'-CCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:239) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region
was removed from pCGN8620 by complete digestion with Asp718I and partial digestion with
NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment,
then ligated into pCGN5139 that had been digested with Asp718I and HindIII and
blunt-ended by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert
oriented so that the d35S promoter was closest to the blunted Asp718I site of pCGN5139 and
the tml 3' was closest to the blunted HindIII site was subjected to sequence analysis to
confirm both the insert orientation and the integrity of cloning junctions. The resulting
plasmid was designated pCGN8624.

The plasmid pCGN8621 was constructed by ligating oligonucleotides
5'-TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT-3' (SEQ ID NO:240) and
5'-GGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:241) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region
was removed from pCGN8621 by complete digestion with Asp718I and partial digestion with
NotI. The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment,
then ligated into pCGN5139 that had been digested with Asp718I and HindIII and
blunt-ended by filling in the 5' overhangs with Klenow fragment. A plasmid containing the insert
oriented so that the d35S promoter was closest to the blunted Asp718I site of pCGN5139 and
the tml 3' was closest to the blunted HindIII site was subjected to sequence analysis to
confirm both the insert orientation and the integrity of cloning junctions. The resulting
plasmid was designated pCGN8625.

The coding regions of the various acyltransferase sequences were cloned as
NotI/Sse8387I fragments into pCGN8622, pCGN8623, pCGN8624, and pCGN8625, for
expression in sense or antisense orientations from a tissue preferential promoter, napin, or the
35S promoter. Fragments which were cloned into the pCGN8622 vector created the
constructs pCGN8901 (ATAT1), pCGN8571 (ATAT2), pCGN8909 (ATAT3), pCGN8596
(ATAT4), pCGN8919 (ATAT6), pCGN8914 (ATAT7), pCGN8905 (ATAT8), pCGN9973
(ATAT9), pCGN9942 (ATAT10), pCGN8575 (ATAT11), and pCGN8633 (ATLPAAT1) for
the sense expression of the respective coding sequences from the napin promoter. Fragments
which were cloned into the pCGN8623 vector created the constructs pCGN8900 (ATAT1),
pCGN8572 (ATAT2), pCGN8910 (ATAT3), pCGN8597 (ATAT4), pCGN8920 (ATAT6),
pCGN8915 (ATAT7), pCGN8906 (ATAT8), pCGN9972 (ATAT9), pCGN9943 (ATAT10),
pCGN8576 (ATAT11), and pCGN8634 (ATLPAAT1) for the antisense expression of the
respective coding sequences from the napin promoter. Fragments which were cloned into the
pCGN8624 vector created the constructs pCGN8903 (ATAT1), pCGN8573 (ATAT2),
pCGN8911 (ATAT3), pCGN8598 (ATAT4), pCGN8921 (ATAT6), pCGN8916 (ATAT7),
pCGN8907 (ATAT8), pCGN9971 (ATAT9), pCGN9944 (ATAT10), pCGN8577 (ATAT11),
and pCGN8635 (ATLPAAT1) for the sense expression of the respective coding sequences
from the 35S promoter. Fragments which were cloned into the pCGN8625 vector created the
constructs pCGN8902 (ATAT1) and pCGN9974 (ATAT9) for the antisense expression of
the respective coding sequences from the 35S promoter.

In addition, the yeast acyltransferase coding sequences were cloned into the vector
pCGN8624 creating the constructs pCGN9926 (YSCAT1), pCGN9927 (YSCAT2),
pCGN9928 (YSCAT3), pCGN9929 (YSCAT4), pCGN9930 (YSCAT5), pCGN9931
(YSCAT6), pCGN9932 (YSCAT7), and pCGN9933 (YSCAT8). These constructs allow for
the sense expression of the respective acyltransferase coding sequences from the 35S
promoter in plant cells.

Example 8: Plant Transformation 

A variety of methods have been developed to insert a DNA sequence of interest into the
genome of a plant host to obtain the transcription, or transcription and translation, of the sequence
to effect phenotypic changes.

Transgenic Brassica plants are obtained by Agrobacterium-mediated transformation
as described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694; Plant Cell Reports
(1992) 11:499-505). Transgenic Arabidopsis thaliana plants may be obtained by
Agrobacterium-mediated transformation as described by Valvekens et al. (Proc. Nat. Acad.
Sci. (1988) 85:5536-5540), or as described by Bent et al. ((1994) Science 265:1856-1860),
Bechtold et al. ((1993) C.R. Acad. Sci., Life Sciences 316:1194-1199), or Clough et al. ((1998)
Plant J. 16:735-43). Other plant species may be similarly transformed using related
techniques.

Alternatively, microprojectile bombardment methods, such as described by Klein et
al. (Bio/Technology 10:286-291), may also be used to obtain nuclear transformed plants.

The above results demonstrate that the nucleic acid sequences identified encode
proteins which are related to known acyltransferase proteins. Such
acyltransferase sequences find use in preparing expression constructs for plant
transformations.

All publications and patent applications mentioned in this specification are indicative
of the level of skill of those skilled in the art to which this invention pertains. All
publications and patent applications are herein incorporated by reference to the same extent as
if each individual publication or patent application was specifically and individually indicated
to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be obvious that
certain changes and modifications may be practiced within the scope of the appended claims.
Claims

What is Claimed is:

1. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins,

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 127 
(VxNHxS) wherein the H is the conserved Histidine residue in the conserved peptide 
sequence HXXXXD of said acyltransferase-like protein, x representing any amino acid. 

2. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like 
proteins, 

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 128 
(VTYSxS) within about 30 amino acids downstream from the conserved amino acid sequence 
HXXXXD of said acyltransferase-like protein, x representing any amino acid. 

3. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like 
proteins, 

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 129 
(VxLTRxR) within about 60 amino acids downstream from the conserved amino acid 
sequence HXXXXD of said acyltransferase-like protein, x representing any amino acid. 

4. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like 
proteins, 

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 132 
(LxxGDLV) within about 20 amino acids upstream of the conserved amino acid sequence 
PEG of said acyltransferase-like protein, x representing any amino acid. 

5. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like 
proteins, 

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 130 (CPEGT) 
containing the conserved amino acid sequence PEG of said acyltransferase-like protein. 

6. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like
proteins, 

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 133 
(FxxGAF) within about 20 amino acids downstream from the conserved amino acid sequence 
PEG of said acyltransferase-like protein, x representing any amino acid. 

7. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like 
proteins, 

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 131 (IVPVA) 
within about 40 amino acids downstream from the conserved amino acid sequence PEG of 
said acyltransferase-like protein. 

8. An isolated DNA sequence encoding an enzyme of the class of acyltransferase-like 
proteins, 

wherein said enzyme includes the amino acid sequence of SEQ ID NO: 134 
(VANxxQ) within about 110 amino acids downstream from the conserved amino acid
sequence PEG of said acyltransferase-like protein, x representing any amino acid. 

9. A DNA sequence encoding an enzyme of the class of acyltransferase-like proteins, 
said DNA sequence obtainable by the steps comprising: 

(a) using the profile of Figure 1 to search a nucleic acid sequence database; 

(b) obtaining a probability score for nucleic acid sequences in said sequence 
database using the Smith-Waterman algorithm; and 

(c) selecting a nucleic acid sequence having a probability score of less than about 1.

10. The DNA encoding sequence according to Claim 9, wherein said DNA sequence 
is an encoding sequence. 

11. The DNA encoding sequence according to Claim 9, wherein said DNA sequence 
is an EST. 

12. The DNA encoding sequence according to any one of Claims 1 to 11, wherein
said acyltransferase-like protein is from a plant.

13. A construct comprising a DNA sequence of any one of Claims 1 to 11 linked to a
heterologous transcriptional and translational initiation region functional in a host cell.

14. The construct according to Claim 13 wherein said host cell is a plant cell.

15. A plant cell comprising a DNA construct according to Claim 13.

16. A plant comprising a cell according to Claim 15.

17. The DNA encoding sequence of any one of Claims 1 to 11 wherein said
acyltransferase-like protein is from Arabidopsis thaliana.

18. The DNA encoding sequence of any one of Claims 1 to 11 wherein said
acyltransferase-like protein is from corn.

19. The DNA encoding sequence of Claim 18 wherein said sequence comprises an
EST selected from the group consisting of SEQ ID NO: 86 through SEQ ID NO: 126.

20. The DNA encoding sequence of any one of Claims 1 to 11 wherein said
acyltransferase-like protein is from soybean.

21. The DNA encoding sequence of Claim 20 wherein said sequence comprises an
EST selected from the group consisting of SEQ ID NO: 24 through SEQ ID NO: 85.

22. The DNA encoding sequence of any one of Claims 2, 3, 4, 5, 7 and 8 wherein
said acyltransferase-like protein is selected from the group consisting of SEQ ID NO: 1, SEQ
ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14 and SEQ ID NO: 16.

23. The DNA encoding sequence of either of Claim 1 and Claim 6 wherein said
acyltransferase-like protein is selected from the group consisting of SEQ ID NO: 3, SEQ ID
NO: 5, SEQ ID NO: 7 and SEQ ID NO: 18.

SEQUENCE LISTING

<110> Lassner, Michael W
      Emig, Robin A
      Ruezinsky, Diane
      Van Eenennaam, Alison

<120> Novel Plant Acyltransferases

<130> 17029/00/WO

<140>
<141>

<150> 60/101,939
<151> 1998-09-25

<160> 241

<170> PatentIn Ver. 2.0

<210> 1
<211> 869
<212> DNA
<213> Arabidopsis sp.

<400> 1
atggttcatg cgaccaagtc agccacaacg attccaaaag aacgcttaaa gaaccgcata 60
gtcttccatg atgggcgttt agcgcaacgt ccaactccgt taaacgccat tatcacatac 120
ctatggcttc cttttggttt catctctcca tcattcgcgt ctacttcaac ctccctttac 180
ctgaaagatt tgtccgttac acttacgaga tgctcgggat ccacttaacc attcgtggtc 240
atcgtcctcc acctccttcc cctggaactc ttggcaacct ctatgtcctt aaccaccgta 300
ccgcgcttga tcccatcatc gtcgctattg ctcttggacg taagatctgt tgcgtcactt 360
acagtgtctc tcgtctctcc cttatgcttt ctcctattcc tgctgttgcc ctcacccgtg 420
accgtgccac cgatgctgcc aacatgagaa aacttctcga gaaaggcgac ttggtgatat 480
gtccggaagg cacgacgtgt agagaagagt atctactgag atttagcgct ctattcgcag 540
agctaagcga ccggattgtg ccagtagcga tgaactgtaa acaaggaatg ttcaacggga 600
ccacagttag gggtgtgaag ttttgggacc cttacttctt cttcatgaac ccaagaccaa 660
gctatgaagc cactttcttg gatcgtttgc ctgaagaaat gactgtcaac ggtggtggca 720
agactcctat agaggtggct aattacgtcc agaaagttat cggcgcggtt ttgggcttcg 780
aatgcaccga acttactcgc aaggataaat atcttttgct tggaggtaat gacggcaagg 840
tggagtctat caacaacacc aagaagtga                                   869

<210> 2
<211> 289
<212> PRT
<213> Arabidopsis sp.

<400> 2
Met Val His Ala Thr Lys Ser Ala Thr Thr Ile Pro Lys Glu Arg Leu
  1               5                  10                  15

Lys Asn Arg Ile Val Phe His Asp Gly Arg Leu Ala Gln Arg Pro Thr
             20                  25                  30

Pro Leu Asn Ala Ile Ile Thr Tyr Leu Trp Leu Pro Phe Gly Phe Ile
         35                  40                  45

Leu Ser Ile Ile Arg Val Tyr Phe Asn Leu Pro Leu Pro Glu Arg Phe
     50                  55                  60

Val Arg Tyr Thr Tyr Glu Met Leu Gly Ile His Leu Thr Ile Arg Gly
 65                  70                  75                  80

His Arg Pro Pro Pro Pro Ser Pro Gly Thr Leu Gly Asn Leu Tyr Val
                 85                  90                  95

Leu Asn His Arg Thr Ala Leu Asp Pro Ile Ile Val Ala Ile Ala Leu
            100                 105                 110
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SEQUENCE LISTING 



<110> Lassner, Michael W 
Emig, Robin A 
Ruezinsky, Diane 
Van Eenennaam, Alison 

<120> Novel Plant Acyltransf erases 

<130> 17029/00/WO 

<140> 
<141> 

<150> 60/101,939 
<151> 1998-09-25 

<160> 241 

<170> Patentln Ver. 2.0 

<210> 1 
<211> 869 
<212> DNA 

<213> Arabidopsis sp. 



<400> 1 

atggttcatg 

gtcttccatg 

ctatggcttc 

ctgaaagatt 

atcgtcctcc 

ccgcgcttga 

acagtgtctc 

accgtgccac 

gtccggaagg 

agctaagcga 

ccacagttag 

gctatgaagc 

agactcctat 

aatgcaccga 

tggagtctat 



cgaccaagtc 
atgggcgttt 
cttttggttt 
tgtccgttac 
acctccttcc 
tcccatcatc 
tcgtctctcc 
cgatgctgcc 
cacgacgtgt 
ccggattgtg 
gggtgtgaag 
cactttcttg 
agaggtggct 
acttactcgc 
caacaacacc 



agccacaacg 
agcgcaacgt 
catctctcca 
acttacgaga 
cctggaactc 
gtcgctattg 
cttatgcttt 
aacatgagaa 
agagaagagt 
ccagtagcga 
ttttgggacc 
gatcgtttgc 
aattacgtcc 
aaggataaat 
aagaagtga 



attccaaaag 
ccaactccgt 
tcattpgcgt 
tgctcgggat 
ttggcaacct 
ctcttggacg 
ctcctattcc 
aacttctcga 
atctactgag 
tgaactgtaa 
cttacttctt 
ctgaagaaat 
agaaagttat 
atcttttgct 



aacgcttaaa 
taaacgccat 
ctacttcaac 
ccacttaacc 
ctatgtcctt 
taagatctgt 
tgctgttgcc 
gaaaggcgac 
atttagcgct 
acaaggaatg 
cttcatgaac 
gactgtcaac 
cggcgcggtt 
tggaggtaat 



gaaccgcata 
tatcacatac 
ctccctttac 
attcgtggtc 
aaccaccgta 
tgcgtcactt 
ctcacccgtg 
ttggtgatat 
ctattcgcag 
ttcaacggga 
ccaagaccaa 
ggtggtggca 
ttgggcttcg 
gacggcaagg 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

869 



<210> 2 
<211> 289 
<212> PRT 

<213> Arabidopsis sp. 

MeS!°Va? His Ala Thr Lys Ser Ala Thr Thr He Pro Lys Glu Arg Leu 
1 5 1° 10 

Lys Asn Arg He Val Phe His Asp Gly Arg Leu Ala Gin Arg Pro Thr 
20 25 30 

Pro Leu Asn Ala He He Thr Tyr Leu Trp Leu Pro Phe Gly Phe He 
35 40 45 

Leu Ser He He Arg Val Tyr Phe Asn Leu Pro Leu Pro Glu Arg Phe 
50 55 60 

Val Arg Tyr Thr Tyr Glu Met Leu Gly He His Leu Thr He Arg Gly 
65 ~ " 70 75 

His Arg Pro Pro Pro Pro Ser Pro Gly Thr Leu Gly Asn Leu Tyr Val 
85 90 

Leu Asn His Arg Thr Ala Leu Asp Pro He He Val Ala He Ala Leu 
100 105 ■ L - LU 
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Gly Arg Lys 
115 

Met Leu Ser 
130 

Asp Ala Ala 
145 

Cys Pro Glu 



Ala Leu Phe 



Cys Lys Gin 
195 

Trp Asp Pro 
210 

Thr Phe Leu 
225 

Lys Thr Pro 



Val Leu Gly 



Leu Leu Gly 
275 



lie Cys Cys Val 



Pro lie Pro Ala 
135 

Asn Met Arg Lys 
150 

Gly Thr Thr Cys 
165 

Ala Glu Leu Ser 
180 

Gly Met Phe Asn 



Tyr Phe Phe Phe 
215 

Asp Arg Leu Pro 
230 

lie Glu Val Ala 
245 

Phe Glu Cys Thr 
260 

Gly Asn Asp Gly 



Thr Tyr Ser 
120 

Val Ala Leu 

Leu Leu Glu 

Arg Glu Glu 
170 

Asp Arg lie 
185 

Gly Thr Thr 
200 

Met Asn Pro 
Glu Glu Met 



Asn Tyr Val 
250 

Glu Leu Thr 
265 

Lys Val Glu 
280 



Val Ser Arg Leu Ser Leu 
125 



Thr Arg 
140 

Lys Gly 
155 

Tyr Leu 

Val Pro 

Val Arg 

Arg Pro 
220 

Thr Val 
235 

Gin Lys 
Arg Lys 
Ser He 



Asp Arg Ala Thr 



Asp Leu Val He 
160 

Leu Arg Phe Ser 
175 

Val Ala Met Asn 
190 

Gly Val Lys Phe 
205 

Ser Tyr Glu Ala 



Asn Gly Gly Gly 
240 

Val He Gly Ala 
255 

Asp Lys Tyr Leu 
270 

Asn Asn Thr Lys 
285 



Lys 



<210> 3 
<211> 939 
<212> DNA 

<213> Arabidopsis sp . 



<400> 3
atgacgagct ttactacttc ccttcatgct gtcccgagtg aaaaatttat gggcgaaaca 60
agacgtactg gcattcaatg gtctaaccgc tctttaagac atgatcctta cagatttctt 120
gataagaaat cacctagatc aagtcaattg gcaagagata tcactgtgag agcagatctt 180
tcaggagctg caacccctga ctcttctttt cctgaaccag agattaagtt gagctcaaga 240
ctcagaggga tattcttttg tgttgttgct ggcatttcgg ctacttttct cattgtcctg 300
atgattattg ggcatccgtt cgtccttctc ttcgatccct ataggagaaa attccaccac 360
ttcattgcta aactttgggc ttccataagc atttatccgt tttacaaaat caacatcgag 420
ggtttggaga atctgccatc atcagacact cctgctgtat atgtttcaaa ccaccaaagt 480
tttctggata tctacacact tcttagtctt ggaaaaagct ttaagttcat cagcaagaca 540
gggatattcg taattcccat catcggttgg gccatgtcca tgatgggtgt cgttcccttg 600
aagcggatgg acccaagaag ccaagtggat tgcttaaaac gctgcatgga acttttaaag 660
aagggagcat ctgtgttttt cttcccagaa ggaacacgga gtaaggatgg tcggttaggt 720
tctttcaaga aaggcgcatt tacagtggct gcgaagaccg gagttgcagt agttccaata 780
acgctaatgg gaacaggcaa aatcatgcca acgggtagtg aaggtatact gaaccatggg 840
aatgtgagag ttatcatcca taaaccaata catggaagca aagcggatgt tctttgcaac 900
gaggccagaa gcaagattgc agaatcaatg gatctctaa 939



<210> 4 
<211> 312 
<212> PRT 

<213> Arabidopsis sp . 
<400> 4 

Met Thr Ser Phe Thr Thr Ser Leu His Ala Val Pro Ser Glu Lys Phe
1 5 10 15

Met Gly Glu Thr Arg Arg Thr Gly Ile Gln Trp Ser Asn Arg Ser Leu
20 25 30

Arg His Asp Pro Tyr Arg Phe Leu Asp Lys Lys Ser Pro Arg Ser Ser
35 40 45

Gln Leu Ala Arg Asp Ile Thr Val Arg Ala Asp Leu Ser Gly Ala Ala
50 55 60

Thr Pro Asp Ser Ser Phe Pro Glu Pro Glu Ile Lys Leu Ser Ser Arg
65 70 75 80

Leu Arg Gly Ile Phe Phe Cys Val Val Ala Gly Ile Ser Ala Thr Phe
85 90 95

Leu Ile Val Leu Met Ile Ile Gly His Pro Phe Val Leu Leu Phe Asp
100 105 110

Pro Tyr Arg Arg Lys Phe His His Phe Ile Ala Lys Leu Trp Ala Ser
115 120 125

Ile Ser Ile Tyr Pro Phe Tyr Lys Ile Asn Ile Glu Gly Leu Glu Asn
130 135 140

Leu Pro Ser Ser Asp Thr Pro Ala Val Tyr Val Ser Asn His Gln Ser
145 150 155 160

Phe Leu Asp Ile Tyr Thr Leu Leu Ser Leu Gly Lys Ser Phe Lys Phe
165 170 175

Ile Ser Lys Thr Gly Ile Phe Val Ile Pro Ile Ile Gly Trp Ala Met
180 185 190

Ser Met Met Gly Val Val Pro Leu Lys Arg Met Asp Pro Arg Ser Gln
195 200 205

Val Asp Cys Leu Lys Arg Cys Met Glu Leu Leu Lys Lys Gly Ala Ser
210 215 220

Val Phe Phe Phe Pro Glu Gly Thr Arg Ser Lys Asp Gly Arg Leu Gly
225 230 235 240

Ser Phe Lys Lys Gly Ala Phe Thr Val Ala Ala Lys Thr Gly Val Ala
245 250 255

Val Val Pro Ile Thr Leu Met Gly Thr Gly Lys Ile Met Pro Thr Gly
260 265 270

Ser Glu Gly Ile Leu Asn His Gly Asn Val Arg Val Ile Ile His Lys
275 280 285

Pro Ile His Gly Ser Lys Ala Asp Val Leu Cys Asn Glu Ala Arg Ser
290 295 300

Lys Ile Ala Glu Ser Met Asp Leu
305 310

<210> 5 
<211> 1197 
<212> DNA 

<213> Arabidopsis sp. 
<400> 5 

atggaatcag agctcaaaga tttgaattcg aattcgaatc ctccgtcgag caaagaggac 60 
cggccgttac tgaaatcaga atccgatttg gcggctgcca ttgaagagtt agacaaaaag 120 
ttcgcacctt acgcgaggac cgatttgtat gggacgatgg gtttgggtcc tttcccgatg 180 
acggagaata ttaaattggc ggttgcattg gtgactcttg ttccattgcg gtttcttctc 240 
tcgatgagca tcttgcttct ctattacttg atttgtaggg tatttacgct gttttctgct 300 
ccttatcgtg ggccagagga agaggaagat gaaggtggag ttgtttttca ggaagattat 360
gctcacatgg aaggttggaa acggactgtt atcgtccggt ctgggaggtt tctctctagg 420 
gttttgcttt tcgtttttgg gttttattgg attcacgaga gctgtccaga tcgagattca 480

gacatggatt ctaatcctaa aactacttct acagagatta accagaaagg ggaagccgcc 540 

acggaggaac ctgaaagacc tggagccatt gtgtccaatc atgtttcgta cttggacatt 600 

ttgtatcata tgtctgcttc ttttccaagt tttgttgcca agagatcagt gggcaaactt 660 

cctcttgttg gcctcattag caaatgcctt ggttgtgtct atgttcaaag agaagcaaaa 720 

tcgcctgatt tcaagggtgt atctggcaca gtaaatgaaa gagttcgaga agctcatagc 780 

aataaatctg ctccaactat tatgcttttt ccagaaggaa caactaccaa tggagactac 840 

ttacttacat tcaagacagg tgcatttttg gctggaactc cagttcttcc ggtaatatta 900 

aaatatccgt atgagcgctt cagtgtggca tgggatacca tatccggggc acgccacatt 960 

ttattccttc tctgtcaagt cgtaaatcac ttggaagtca tacggttacc tgtatactac 
1020 

ccatcccaag aagagaaaga cgatcccaaa ctttatgcta gcaatgttcg gaaattaatg 
1080 

gccaccgagg gtaacttgat tctatcggag ttgggactta gcgacaaaag gatatatcac 
1140 

gcaactctca atggtaatct tagtcaaacc cgtgatttcc atcagaaaga agaatga 
1197 

<210> 6 
<211> 398 
<212> PRT 

<213> Arabidopsis sp. 
<400> 6 

Met Glu Ser Glu Leu Lys Asp Leu Asn Ser Asn Ser Asn Pro Pro Ser
1 5 10 15

Ser Lys Glu Asp Arg Pro Leu Leu Lys Ser Glu Ser Asp Leu Ala Ala
20 25 30

Ala Ile Glu Glu Leu Asp Lys Lys Phe Ala Pro Tyr Ala Arg Thr Asp
35 40 45

Leu Tyr Gly Thr Met Gly Leu Gly Pro Phe Pro Met Thr Glu Asn Ile
50 55 60

Lys Leu Ala Val Ala Leu Val Thr Leu Val Pro Leu Arg Phe Leu Leu
65 70 75 80

Ser Met Ser Ile Leu Leu Leu Tyr Tyr Leu Ile Cys Arg Val Phe Thr
85 90 95

Leu Phe Ser Ala Pro Tyr Arg Gly Pro Glu Glu Glu Glu Asp Glu Gly
100 105 110

Gly Val Val Phe Gln Glu Asp Tyr Ala His Met Glu Gly Trp Lys Arg
115 120 125

Thr Val Ile Val Arg Ser Gly Arg Phe Leu Ser Arg Val Leu Leu Phe
130 135 140

Val Phe Gly Phe Tyr Trp Ile His Glu Ser Cys Pro Asp Arg Asp Ser
145 150 155 160

Asp Met Asp Ser Asn Pro Lys Thr Thr Ser Thr Glu Ile Asn Gln Lys
165 170 175

Gly Glu Ala Ala Thr Glu Glu Pro Glu Arg Pro Gly Ala Ile Val Ser
180 185 190

Asn His Val Ser Tyr Leu Asp Ile Leu Tyr His Met Ser Ala Ser Phe
195 200 205

Pro Ser Phe Val Ala Lys Arg Ser Val Gly Lys Leu Pro Leu Val Gly
210 215 220

Leu Ile Ser Lys Cys Leu Gly Cys Val Tyr Val Gln Arg Glu Ala Lys
225 230 235 240

Ser Pro Asp Phe Lys Gly Val Ser Gly Thr Val Asn Glu Arg Val Arg
245 250 255

Glu Ala His Ser Asn Lys Ser Ala Pro Thr Ile Met Leu Phe Pro Glu
260 265 270

Gly Thr Thr Thr Asn Gly Asp Tyr Leu Leu Thr Phe Lys Thr Gly Ala
275 280 285

Phe Leu Ala Gly Thr Pro Val Leu Pro Val Ile Leu Lys Tyr Pro Tyr
290 295 300

Glu Arg Phe Ser Val Ala Trp Asp Thr Ile Ser Gly Ala Arg His Ile
305 310 315 320

Leu Phe Leu Leu Cys Gln Val Val Asn His Leu Glu Val Ile Arg Leu
325 330 335

Pro Val Tyr Tyr Pro Ser Gln Glu Glu Lys Asp Asp Pro Lys Leu Tyr
340 345 350

Ala Ser Asn Val Arg Lys Leu Met Ala Thr Glu Gly Asn Leu Ile Leu
355 360 365

Ser Glu Leu Gly Leu Ser Asp Lys Arg Ile Tyr His Ala Thr Leu Asn
370 375 380

Gly Asn Leu Ser Gln Thr Arg Asp Phe His Gln Lys Glu Glu
385 390 395



<210> 7 
<211> 1131 
<212> DNA 

<213> Arabidopsis sp. 



<400> 7
atgagcagta cggcagggag gctcgtgact tcaaaatccg agcttgacct cgatcaccct 60
aacatcgaag attaccttcc ttctggttct tccatcaatg aacctcgcgg caagctcagc 120
ctgcgtgatt tgctagacat ctctccaacg ctcactgaag ctgctggtgc cattgttgat 180
gactcgttca caagatgttt caaatcaaat cctccagaac cttggaactg gaatatttac 240
ttattcccac tatactgctt tggggttgtt gttagatact gtatcctctt tcccttgagg 300
tgcttcactt tagcttttgg gtggattatt ttcctttcat tgtttatccc tgtaaatgcg 360
ttgctgaaag gtcaagatag gttgaggaaa aagatagaga gggtcttggt ggaaatgatt 420
tgcagctttt ttgtcgcctc atggaccgga gttgtcaaat atcacgggcc acgtcctagc 480
atccgtccta agcaggtcta tgttgccaac catacttcaa tgattgattt catcgtattg 540
gagcagatga ccgcatttgc tgttataatg cagaagcatc ctggttgggt tggtcttctg 600
caaagcacaa tattagagag tgtgggatgt atctggttca atcgttcaga ggcaaaggat 660
cgtgaaattg tagcaaaaaa gttaagggac catgtccaag gagctgacag taatcctctt 720
ctcatatttc ccgaagggac atgtgtaaat aataattaca cagtgatgtt taagaagggt 780
gcttttgaat tggactgcac tgtttgtcca attgcaatta aatacaacaa gatttttgtt 840
gacgccttct ggaatagcag aaaacaatca tttactatgc acttgctgca actcatgaca 900
tcatgggctg ttgtatgtga agtgtggtac ttggaaccac aaaccataag gcccggtgaa 960
acaggaattg aatttgcaga gagggtcaga gacatgatat ctcttcgggc gggtctcaaa 1020
aaggtccctt gggatggata cttgaagtat tcgagaccaa gccccaagca tagtgaacgc 1080
aagcaacaga gtttcgcaga gtcgatcctg gctagattgg aagagaagtg a 1131



<210> 8 
<211> 376 
<212> PRT 

<213> Arabidopsis sp . 
<400> 8 

Met Ser Ser Thr Ala Gly Arg Leu Val Thr Ser Lys Ser Glu Leu Asp
1 5 10 15

Leu Asp His Pro Asn Ile Glu Asp Tyr Leu Pro Ser Gly Ser Ser Ile
20 25 30

Asn Glu Pro Arg Gly Lys Leu Ser Leu Arg Asp Leu Leu Asp Ile Ser
35 40 45

Pro Thr Leu Thr Glu Ala Ala Gly Ala Ile Val Asp Asp Ser Phe Thr
50 55 60

Arg Cys Phe Lys Ser Asn Pro Pro Glu Pro Trp Asn Trp Asn Ile Tyr
65 70 75 80

Leu Phe Pro Leu Tyr Cys Phe Gly Val Val Val Arg Tyr Cys Ile Leu
85 90 95

Phe Pro Leu Arg Cys Phe Thr Leu Ala Phe Gly Trp Ile Ile Phe Leu
100 105 110

Ser Leu Phe Ile Pro Val Asn Ala Leu Leu Lys Gly Gln Asp Arg Leu
115 120 125

Arg Lys Lys Ile Glu Arg Val Leu Val Glu Met Ile Cys Ser Phe Phe
130 135 140

Val Ala Ser Trp Thr Gly Val Val Lys Tyr His Gly Pro Arg Pro Ser
145 150 155 160

Ile Arg Pro Lys Gln Val Tyr Val Ala Asn His Thr Ser Met Ile Asp
165 170 175

Phe Ile Val Leu Glu Gln Met Thr Ala Phe Ala Val Ile Met Gln Lys
180 185 190

His Pro Gly Trp Val Gly Leu Leu Gln Ser Thr Ile Leu Glu Ser Val
195 200 205

Gly Cys Ile Trp Phe Asn Arg Ser Glu Ala Lys Asp Arg Glu Ile Val
210 215 220

Ala Lys Lys Leu Arg Asp His Val Gln Gly Ala Asp Ser Asn Pro Leu
225 230 235 240

Leu Ile Phe Pro Glu Gly Thr Cys Val Asn Asn Asn Tyr Thr Val Met
245 250 255

Phe Lys Lys Gly Ala Phe Glu Leu Asp Cys Thr Val Cys Pro Ile Ala
260 265 270

Ile Lys Tyr Asn Lys Ile Phe Val Asp Ala Phe Trp Asn Ser Arg Lys
275 280 285

Gln Ser Phe Thr Met His Leu Leu Gln Leu Met Thr Ser Trp Ala Val
290 295 300

Val Cys Glu Val Trp Tyr Leu Glu Pro Gln Thr Ile Arg Pro Gly Glu
305 310 315 320

Thr Gly Ile Glu Phe Ala Glu Arg Val Arg Asp Met Ile Ser Leu Arg
325 330 335

Ala Gly Leu Lys Lys Val Pro Trp Asp Gly Tyr Leu Lys Tyr Ser Arg
340 345 350

Pro Ser Pro Lys His Ser Glu Arg Lys Gln Gln Ser Phe Ala Glu Ser
355 360 365

Ile Leu Ala Arg Leu Glu Glu Lys
370 375



<210> 9 
<211> 965 






<212> DNA 

<213> Arabidopsis sp. 



<400> 9
gttgttaagt tacaagtctc ttcaaaaaca cacacacacg tctctcttca cagccaatca 60
tcgatcacag ctcgattttc ctttattgtt ccgttggttt tcttgagnat ttttctttct 120
tgggatcatc aaactngtcg gtaaggwaac ttcacggacg gatcttcaat gttgagctgt 180
tctaatggta ccgtcgtgat cgcaaccgcc atggtttgct caagcaccgc tctgtttctc 240
gccatggctc gtcaattcca tggaaatcat caaaatccta aggttcttga tcagactcta 300
cgacccattc tccgttcttg tctatcttca gaggaaacga agaaacaggg gaagaagata 360
aagaaagtgc ggttcgcgga taatgtgaaa gatacgaaag gtaacgggga agagtaccgg 420
aggagggaat tgaaccggaa aagcgtaccg aagccagtga ctaaaccggg aaagaccggt 480
tctatgtgta gaatctctac catgccagcg aaccggatgg ctctgtacaa tgggattctt 540
agagaccgag atcacagagt tcaatattct tattgacttt ttcttcttga ttagtcaata 600
gatttaggtt ttgtaaatct ttcttttgtt tttcggtaat attagatttt ttcttggaaa 660
tttcagatat tgtagacttt gtagttgggt ggtcttcttt ttctcccttt ttgtgtctca 720
tagtagtagg tggttttctt atgctccact tatctactta cttgttttaa atcaagtgat 780
gatgtaaata attgacatgt aagtagtcat tagaaatttg aaaaggcaaa tgaaagaata 840
taaatttgta aaaacatagt gtgcctattg tacatataaa ctctcttttg ttggggatat 900
ctatggaatt tatattgatt gtgttgaaaa aacaaaaaaa aaaaaaaaaa aaaaaaaaaa 960
aaaaa 965



<210> 10 
<211> 1593 
<212> DNA 

<213> Arabidopsis sp . 



<400> 10 

atgtccggta ataagatctc gactcttcaa gctcttgtct tcttcttgta ccggtttttc 60 
attctccgtc gttggtgtca tcgtagccct aaacaaaaat accaaaaatg cccttctcac 120 
ggcctccacc aatatcaaga cctatcgaat cacactttga tattcaacgt cgaaggagct 180
ctactcaaat caaactcttt attcccttac ttcatggttg tggcattcga agccggaggg 240 
gtgataaggt cacttttcct cttagttctt tatccattta taagcttgat gagctacgaa 300 
atgggcttga agacgatggt gatgctgagc ttctttggag ttaaaaagga aagcttccga 360 
gtggggaaat cagttttgcc taagtatttt ctagaagatg ttgggctcga gatgttccag 420 
gttttgaaaa gaggaggcaa gagagttgct gtgagtgatt taccacaagt tatgattgat 480
gtattcttgc gagattactt ggagatagaa gttgtggtcg gaagagacat gaaaatggtc 540
ggtggttact acctaggcat cgtggaggat aagaagaacc ttgaaattgc ttttgataaa 600 
gtggttcaag aagaaagact tggtagtggt cgtcgtctta ttggcatcac ttcctttaac 660 
tcgccaagtc acagatctct cttctctcaa ttttgccagg aaatttactt cgtcagaaat 720
tcagacaaga aaagttggca aaccctacca caagatcaat accctaaacc attgattttc 780 
cacgatggtc gtttagccgt taagccaaca cctttaaaca cactcgtatt attcatgtgg 840 
gccccattcg ccgccgtctt agccgctgca agactcgtct tcggcctaaa cttaccttac 900 
tccctagcca atcccttcct cgccttttcc ggtatccacc ttactctcac cgtcaacaac 960 
cacaacgacc taatatccgc cgacagaaaa agaggttgtc tctttgtgtg taaccataga 
1020 

acgttattgg acccacttta catttcatac gctctaagaa agaaaaacat gaaagccgtg 
1080 

acgtatagtc taagcagatt atctgagctt ctggctccga tcaagaccgt tagattgact 
1140 

cgtgatcgag tcaaagatgg tcaagccatg gagaaattgc tgagccaggg agatctcgtg 
1200 

gtttgtccgg aagggactac gtgtagagag ccttacttgc ttcggtttag tccacttttc 
1260 

tctgaggttt gtgacgtcat cgtacctgtt gctattgact cacacgtgac tttcttctat 
1320 

ggcacgacgg ctagtggtct taaggcattt gatcccattt tcttcctttt gaatcctttc 
1380 

ccttcctaca ccgtcaaatt gcttgaccct gtctctggaa gtagctcgtc cacgtgtcga 
1440 

ggagtccctg acaatggaaa agttaacttc gaggtggcta atcacgtgca gcatgagatc 
1500 

gggaatgcct tggggtttga gtgcaccaac ctcacgagaa gagataagta cttgatcttg 
1560 

gccggtaata acggagttgt caagaaaaaa taa 
1593 

<210> 11 
<211> 530 
<212> PRT 
<213> Arabidopsis sp.
<400> 11

Met Ser Gly Asn Lys Ile Ser Thr Leu Gln Ala Leu Val Phe Phe Leu
1 5 10 15

Tyr Arg Phe Phe Ile Leu Arg Arg Trp Cys His Arg Ser Pro Lys Gln
20 25 30

Lys Tyr Gln Lys Cys Pro Ser His Gly Leu His Gln Tyr Gln Asp Leu
35 40 45

Ser Asn His Thr Leu Ile Phe Asn Val Glu Gly Ala Leu Leu Lys Ser
50 55 60

Asn Ser Leu Phe Pro Tyr Phe Met Val Val Ala Phe Glu Ala Gly Gly
65 70 75 80

Val Ile Arg Ser Leu Phe Leu Leu Val Leu Tyr Pro Phe Ile Ser Leu
85 90 95

Met Ser Tyr Glu Met Gly Leu Lys Thr Met Val Met Leu Ser Phe Phe
100 105 110

Gly Val Lys Lys Glu Ser Phe Arg Val Gly Lys Ser Val Leu Pro Lys
115 120 125

Tyr Phe Leu Glu Asp Val Gly Leu Glu Met Phe Gln Val Leu Lys Arg
130 135 140

Gly Gly Lys Arg Val Ala Val Ser Asp Leu Pro Gln Val Met Ile Asp
145 150 155 160

Val Phe Leu Arg Asp Tyr Leu Glu Ile Glu Val Val Val Gly Arg Asp
165 170 175

Met Lys Met Val Gly Gly Tyr Tyr Leu Gly Ile Val Glu Asp Lys Lys
180 185 190

Asn Leu Glu Ile Ala Phe Asp Lys Val Val Gln Glu Glu Arg Leu Gly
195 200 205

Ser Gly Arg Arg Leu Ile Gly Ile Thr Ser Phe Asn Ser Pro Ser His
210 215 220

Arg Ser Leu Phe Ser Gln Phe Cys Gln Glu Ile Tyr Phe Val Arg Asn
225 230 235 240

Ser Asp Lys Lys Ser Trp Gln Thr Leu Pro Gln Asp Gln Tyr Pro Lys
245 250 255

Pro Leu Ile Phe His Asp Gly Arg Leu Ala Val Lys Pro Thr Pro Leu
260 265 270

Asn Thr Leu Val Leu Phe Met Trp Ala Pro Phe Ala Ala Val Leu Ala
275 280 285

Ala Ala Arg Leu Val Phe Gly Leu Asn Leu Pro Tyr Ser Leu Ala Asn
290 295 300

Pro Phe Leu Ala Phe Ser Gly Ile His Leu Thr Leu Thr Val Asn Asn
305 310 315 320

His Asn Asp Leu Ile Ser Ala Asp Arg Lys Arg Gly Cys Leu Phe Val
325 330 335

Cys Asn His Arg Thr Leu Leu Asp Pro Leu Tyr Ile Ser Tyr Ala Leu
340 345 350

Arg Lys Lys Asn Met Lys Ala Val Thr Tyr Ser Leu Ser Arg Leu Ser
355 360 365

Glu Leu Leu Ala Pro Ile Lys Thr Val Arg Leu Thr Arg Asp Arg Val
370 375 380

Lys Asp Gly Gln Ala Met Glu Lys Leu Leu Ser Gln Gly Asp Leu Val
385 390 395 400

Val Cys Pro Glu Gly Thr Thr Cys Arg Glu Pro Tyr Leu Leu Arg Phe
405 410 415

Ser Pro Leu Phe Ser Glu Val Cys Asp Val Ile Val Pro Val Ala Ile
420 425 430

Asp Ser His Val Thr Phe Phe Tyr Gly Thr Thr Ala Ser Gly Leu Lys
435 440 445

Ala Phe Asp Pro Ile Phe Phe Leu Leu Asn Pro Phe Pro Ser Tyr Thr
450 455 460

Val Lys Leu Leu Asp Pro Val Ser Gly Ser Ser Ser Ser Thr Cys Arg
465 470 475 480

Gly Val Pro Asp Asn Gly Lys Val Asn Phe Glu Val Ala Asn His Val
485 490 495

Gln His Glu Ile Gly Asn Ala Leu Gly Phe Glu Cys Thr Asn Leu Thr
500 505 510

Arg Arg Asp Lys Tyr Leu Ile Leu Ala Gly Asn Asn Gly Val Val Lys
515 520 525

Lys Lys
530

<210> 12 
<211> 1509 
<212> DNA 

<213> Arabidopsis sp . 



<400> 12
atggttatgg agcaagctgg aacgacatcg tattcggtcg tgtcagagtt tgaaggaaca 60
atactgaaga acgcagattc attctcttac ttcatgctcg tagccttcga agcagctggt 120
ctaattcgtt tcgctatctt gttgtttcta tggcccgtaa tcacactcct tgacgttttc 180
agctacaaaa acgcagctct caagctcaag atttttgtag ccactgttgg tctacgtgaa 240
ccggagatcg aatcagtggc tagagccgtt ctgccaaaat tctacatgga cgacgtaagc 300
atggacacgt ggagggtttt cagctcgtgt aagaagaggg tcgtggtcac gagaatgcct 360
cgagttatgg tggagaggtt tgctaaggag catcttagag cagatgaggt catcggtacg 420
gaactgattg taaaccggtt cggttttgtc accggtttga ttcgcgaaac ggatgttgat 480
cagtctgctt tgaaccgtgt cgctaatttg tttgttggtc ggaggcctca actaggtctt 540
ggaaaaccgg ctttgaccgc ctctacaaat ttcttatcgt tatgtgagga gcatattcat 600
gcaccaatcc cggagaacta caaccacggt gaccaacaac ttcagctacg tccacttccg 660
gtgatatttc acgacggaag actagtgaag cggccaacgc cggccaccgc tctcatcatc 720
ctcctttgga tcccatttgg aatcattctc gccgtgatcc ggatctttct tggagccgtc 780
ctcccattgt gggccacacc ttacgtctct cagatattcg gtggccatat catcgtcaaa 840
ggaaagcctc ctcagccacc ggcggctgga aaatccggcg tgctctttgt gtgtactcac 900
agaaccctaa tggaccctgt ggtattatct tatgtcctcg gacgtagcat cccagccgtt 960
acttactcaa tctcgcgctt atcagagatc ttatctccca ttccaaccgt ccgattgaca 1020
agaatccgag atgtggatgc ggctaagatc aaacaacaac tgtcaaaagg agatctagtg 1080
gtttgtcctg agggaaccac ttgtcgtgaa ccgtttttgt taagattcag cgcgcttttc 1140
gctgagttaa cggataggat tgttccggtt gcgatgaact acagagtcgg attcttccac 1200
gcgactacag cgagaggctg gaagggtttg gacccaattt tcttcttcat gaacccaaga 1260
ccggtttacg agattacgtt cttgaaccag cttcctatgg aggcaacatg ttcgtccggg 1320
aagagcccgc atgacgtggc gaactatgtt cagagaatct tggcggctac gttagggttt 1380
gagtgcacca acttcacaag aaaagataag tatagggttc tcgctggaaa cgatggaaca 1440
gtgtcgtact tgtcgttgct agaccaattg aagaaggtgg ttagcacttt cgagccttgt 1500
ctccattga 1509

<210> 13 
<211> 502 
<212> PRT 

<213> Arabidopsis sp . 
<400> 13 

Met Val Met Glu Gln Ala Gly Thr Thr Ser Tyr Ser Val Val Ser Glu
1 5 10 15

Phe Glu Gly Thr Ile Leu Lys Asn Ala Asp Ser Phe Ser Tyr Phe Met
20 25 30

Leu Val Ala Phe Glu Ala Ala Gly Leu Ile Arg Phe Ala Ile Leu Leu
35 40 45

Phe Leu Trp Pro Val Ile Thr Leu Leu Asp Val Phe Ser Tyr Lys Asn
50 55 60

Ala Ala Leu Lys Leu Lys Ile Phe Val Ala Thr Val Gly Leu Arg Glu
65 70 75 80

Pro Glu Ile Glu Ser Val Ala Arg Ala Val Leu Pro Lys Phe Tyr Met
85 90 95

Asp Asp Val Ser Met Asp Thr Trp Arg Val Phe Ser Ser Cys Lys Lys
100 105 110

Arg Val Val Val Thr Arg Met Pro Arg Val Met Val Glu Arg Phe Ala
115 120 125

Lys Glu His Leu Arg Ala Asp Glu Val Ile Gly Thr Glu Leu Ile Val
130 135 140

Asn Arg Phe Gly Phe Val Thr Gly Leu Ile Arg Glu Thr Asp Val Asp
145 150 155 160

Gln Ser Ala Leu Asn Arg Val Ala Asn Leu Phe Val Gly Arg Arg Pro
165 170 175

Gln Leu Gly Leu Gly Lys Pro Ala Leu Thr Ala Ser Thr Asn Phe Leu
180 185 190

Ser Leu Cys Glu Glu His Ile His Ala Pro Ile Pro Glu Asn Tyr Asn
195 200 205

His Gly Asp Gln Gln Leu Gln Leu Arg Pro Leu Pro Val Ile Phe His
210 215 220

Asp Gly Arg Leu Val Lys Arg Pro Thr Pro Ala Thr Ala Leu Ile Ile
225 230 235 240

Leu Leu Trp Ile Pro Phe Gly Ile Ile Leu Ala Val Ile Arg Ile Phe
245 250 255

Leu Gly Ala Val Leu Pro Leu Trp Ala Thr Pro Tyr Val Ser Gln Ile
260 265 270

Phe Gly Gly His Ile Ile Val Lys Gly Lys Pro Pro Gln Pro Pro Ala
275 280 285

Ala Gly Lys Ser Gly Val Leu Phe Val Cys Thr His Arg Thr Leu Met
290 295 300

Asp Pro Val Val Leu Ser Tyr Val Leu Gly Arg Ser Ile Pro Ala Val
305 310 315 320

Thr Tyr Ser Ile Ser Arg Leu Ser Glu Ile Leu Ser Pro Ile Pro Thr
325 330 335

Val Arg Leu Thr Arg Ile Arg Asp Val Asp Ala Ala Lys Ile Lys Gln
340 345 350

Gln Leu Ser Lys Gly Asp Leu Val Val Cys Pro Glu Gly Thr Thr Cys
355 360 365

Arg Glu Pro Phe Leu Leu Arg Phe Ser Ala Leu Phe Ala Glu Leu Thr
370 375 380

Asp Arg Ile Val Pro Val Ala Met Asn Tyr Arg Val Gly Phe Phe His
385 390 395 400

Ala Thr Thr Ala Arg Gly Trp Lys Gly Leu Asp Pro Ile Phe Phe Phe
405 410 415

Met Asn Pro Arg Pro Val Tyr Glu Ile Thr Phe Leu Asn Gln Leu Pro
420 425 430

Met Glu Ala Thr Cys Ser Ser Gly Lys Ser Pro His Asp Val Ala Asn
435 440 445

Tyr Val Gln Arg Ile Leu Ala Ala Thr Leu Gly Phe Glu Cys Thr Asn
450 455 460

Phe Thr Arg Lys Asp Lys Tyr Arg Val Leu Ala Gly Asn Asp Gly Thr
465 470 475 480

Val Ser Tyr Leu Ser Leu Leu Asp Gln Leu Lys Lys Val Val Ser Thr
485 490 495

Phe Glu Pro Cys Leu His
500

<210> 14 
<211> 1563 
<212> DNA 

<213> Arabidopsis sp. 
<400> 14 

atgtccgcca agatttcaat attccaagct cttgtctttc tattctaccg gtttatcctc 60 
cggcgatatc ggaactctaa accaaaatac caaaatggcc cttcttctct cctccaatcc 120
gacctatcac gccacacatt gatcttcaac gtagaaggag ctcttctcaa atccgactct 180 
ctcttccctt acttcatgtt agtagcattt gaggcgggag gcgtaataag gtcatttctc 240 
ctcttcattc tctatccatt gataagcttg atgagccatg agatgggtgt caaagtgatg 300
gtaatggtga gcttcttcgg gatcaaaaaa gaaggttttc gagcggggag agcggttttg 360
cctaaatact ttctagaaga tgtcggactc gagatcttcg aagtgttgaa gagaggaggg 420 
aagaaaatcg gagtgagtga tgatcttcct caagttatga tcgaagggtt cttgagagat 480 
tacttggaga ttgacgttgt ggtcgggaga gaaatgaaag tcgttggagg ttattatcta 540 
ggtatcatgg aggataaaac caaacatgat cttgtctttg atgagttagt tcgtaaagag 600 
agactaaaca ccggtcgtgt tattggcatc acttccttca atacatctct tcaccgatat 660 
ctattctctc agttttgcca ggaaatttat ttcgtgaaga aatcagacaa gcgaagctgg 720 
caaaccctac cacgaagcca gtaccctaaa ccattgattt tccatgatgg ccgtctcgcg 780
atcaaaccaa ccctaatgaa cactttggtc ttgttcatgt ggggtccttt cgcagccgca 840 
gccgcagcag ccagactctt cgtctctctt tgcatccctt actctttatc aatcccgatc 900 
ctcgcctttt ccggttgcag actaaccgtc actaacgact acgtttcatc tcaaaaacaa 960
aaaccaagtc aacgcaaagg ttgtctcttt gtatgtaacc ataggacttt attggaccct 
1020 

ctctatgttg cattcgcttt gagaaagaaa aacatcaaaa ctgtaacgta tagtttgagt 
1080 

agggtatctg agattttggc tccgatcaag acggtgagac tgacccgtga tcgggtgagc 
1140 
gacggtcaag ccatggagaa attgttaacc gaaggagatc tcgttgtttg tcctgaagga
1200 

accacttgta gagaacctta cctgcttagg tttagccctt tgttcaccga ggttagtgat 
1260 

gtcatcgttc ccgtggctgt gacggtacac gtgaccttct tctacggtac aacggcgagt 
1320 

ggtcttaagg cacttgaccc gcttttcttc ctcttggatc cttatcctac ctacaccatc 
1380 

caatttctcg accctgtctc cggtgccacg tgccaagatc ctgatggaaa gttgaagttt 
1440 

gaggtggcca acaatgttca gagtgatatt gggaaggcgc tggatttcga gtgcacaagt 
1500 

ctcactagaa aagacaagta tttgatcttg gccggtaata atggagtagt taagaaaaat 

1560 

taa 

1563 

<210> 15 
<211> 520 
<212> PRT 

<213> Arabidopsis sp. 
<400> 15 

Met Ser Ala Lys Ile Ser Ile Phe Gln Ala Leu Val Phe Leu Phe Tyr
1 5 10 15

Arg Phe Ile Leu Arg Arg Tyr Arg Asn Ser Lys Pro Lys Tyr Gln Asn
20 25 30

Gly Pro Ser Ser Leu Leu Gln Ser Asp Leu Ser Arg His Thr Leu Ile
35 40 45

Phe Asn Val Glu Gly Ala Leu Leu Lys Ser Asp Ser Leu Phe Pro Tyr
50 55 60

Phe Met Leu Val Ala Phe Glu Ala Gly Gly Val Ile Arg Ser Phe Leu
65 70 75 80

Leu Phe Ile Leu Tyr Pro Leu Ile Ser Leu Met Ser His Glu Met Gly
85 90 95

Val Lys Val Met Val Met Val Ser Phe Phe Gly Ile Lys Lys Glu Gly
100 105 110

Phe Arg Ala Gly Arg Ala Val Leu Pro Lys Tyr Phe Leu Glu Asp Val
115 120 125

Gly Leu Glu Ile Phe Glu Val Leu Lys Arg Gly Gly Lys Lys Ile Gly
130 135 140

Val Ser Asp Asp Leu Pro Gln Val Met Ile Glu Gly Phe Leu Arg Asp
145 150 155 160

Tyr Leu Glu Ile Asp Val Val Val Gly Arg Glu Met Lys Val Val Gly
165 170 175

Gly Tyr Tyr Leu Gly Ile Met Glu Asp Lys Thr Lys His Asp Leu Val
180 185 190

Phe Asp Glu Leu Val Arg Lys Glu Arg Leu Asn Thr Gly Arg Val Ile
195 200 205

Gly Ile Thr Ser Phe Asn Thr Ser Leu His Arg Tyr Leu Phe Ser Gln
210 215 220

Phe Cys Gln Glu Ile Tyr Phe Val Lys Lys Ser Asp Lys Arg Ser Trp
225 230 235 240

Gln Thr Leu Pro Arg Ser Gln Tyr Pro Lys Pro Leu Ile Phe His Asp
245 250 255

Gly Arg Leu Ala Ile Lys Pro Thr Leu Met Asn Thr Leu Val Leu Phe
260 265 270

Met Trp Gly Pro Phe Ala Ala Ala Ala Ala Ala Ala Arg Leu Phe Val
275 280 285

Ser Leu Cys Ile Pro Tyr Ser Leu Ser Ile Pro Ile Leu Ala Phe Ser
290 295 300

Gly Cys Arg Leu Thr Val Thr Asn Asp Tyr Val Ser Ser Gln Lys Gln
305 310 315 320

Lys Pro Ser Gln Arg Lys Gly Cys Leu Phe Val Cys Asn His Arg Thr
325 330 335

Leu Leu Asp Pro Leu Tyr Val Ala Phe Ala Leu Arg Lys Lys Asn Ile
340 345 350

Lys Thr Val Thr Tyr Ser Leu Ser Arg Val Ser Glu Ile Leu Ala Pro
355 360 365

Ile Lys Thr Val Arg Leu Thr Arg Asp Arg Val Ser Asp Gly Gln Ala
370 375 380

Met Glu Lys Leu Leu Thr Glu Gly Asp Leu Val Val Cys Pro Glu Gly
385 390 395 400

Thr Thr Cys Arg Glu Pro Tyr Leu Leu Arg Phe Ser Pro Leu Phe Thr
405 410 415

Glu Val Ser Asp Val Ile Val Pro Val Ala Val Thr Val His Val Thr
420 425 430

Phe Phe Tyr Gly Thr Thr Ala Ser Gly Leu Lys Ala Leu Asp Pro Leu
435 440 445

Phe Phe Leu Leu Asp Pro Tyr Pro Thr Tyr Thr Ile Gln Phe Leu Asp
450 455 460

Pro Val Ser Gly Ala Thr Cys Gln Asp Pro Asp Gly Lys Leu Lys Phe
465 470 475 480

Glu Val Ala Asn Asn Val Gln Ser Asp Ile Gly Lys Ala Leu Asp Phe
485 490 495

Glu Cys Thr Ser Leu Thr Arg Lys Asp Lys Tyr Leu Ile Leu Ala Gly
500 505 510

Asn Asn Gly Val Val Lys Lys Asn
515 520



<210> 16 
<211> 1506 
<212> DNA 

<213> Arabidopsis sp. 
<400> 16
atgggagctc aggagaaacg gcgccgtttc gagcagatat caaagtgcga tgttaaggac 60
cggtccaacc ataccgtggc cgctgatcta gacggaacac tactaatctc tcgtagcgcc 120
ttcccttact atttcctcgt agccctcgag gcagggagct tgctccgagc gttgatccta 180
cttgtgtccg taccattcgt ttatcttacg tacttgacca tctccgagac tttagccatc 240
aacgtatttg tcttcatcac gttcgcgggt ctcaagatcc gagacgttga gctagtggtc 300
cgttccgtcc tcccgaggtt ctatgcggag gacgtgaggc ccgatacctg gcgtatcttc 360
aacacgttcg ggaaacggta cataataact gcgagccctc gaattatggt cgagccattc 420
gtgaaaacat tcctaggagt tgataaagtt cttggaacag agctagaggt ctccaaatcg 480
ggtcgggcaa ccgggttcac cagaaaacca ggtattctcg tcggtcagta caaacgtgac 540
gtcgttttga gagagtttgg tggcctagcg tctgatttac ctgatttggg gctcggcgat 600
agcaagacgg accacgactt catgtccatc tgcaaggaag gttacatggt gccacgtacg 660
aaatgcgaac cattaccaag aaacaaactc ttaagcccca taatattcca cgagggcaga 720
ttagtccaac gcccaacgcc gttagttgct ctgttaactt tcctctggct tcccgtcggt 780 
ttcgtcctct ctatcatccg cgtctacacg aatattccgt taccggaacg tatcgcccgt 840 
tacaactaca agcttactgg catcaagcta gtcgtcaacg gccaccctcc tccgccgcca 900 
aaacctggcc agccaggcca tcttttggtc tgcaaccacc gcaccgttct cgatcctgtg 960 
gtcacagctg tcgcactcgg ccggaaaatc agctgcgtca cttacagcat cagcaagttc 
1020 

tctgagctaa tctcaccaat caaagccgtt gcgttgactc gtcaacgtga gaaagacgca 
1080 

gcgaacatca agcgtctttt ggaggaaggc gatctcgtga tatgtcccga gggaaccacg 
1140 

tgccgtgagc ctttccttct ccggtttagt gctcttttcg ctgagctcac ggaccggatc 
1200 

gttcccgtgg cgatcaacac aaagcagagc atgttcaatg gtaccaccac acgtggatac 
1260 

aagcttcttg atccttactt tgcgttcatg aacccgaggc cgacgtatga gatcacgttc 
1320 

ctcaaacaga ttccagctga gctgacgtgt aaaggaggca aatctccgat agaggttgcg 
1380 

aattacatac agagggtttt gggaggaacc ttaggttttg agtgcaccaa tttcacaaga 
1440

aaggataagt acgcaatgct tgctggtact gacggtaggg ttccggtgaa gaaggagaag 

1500 

acgtga 

1506 

<210> 17 
<211> 500 
<212> PRT 

<213> Arabidopsis sp . 
<400> 17 

Met Gly Ala Gln Glu Lys Arg Arg Arg Phe Glu Gln Ile Ser Lys Cys 
1 5 10 15 

Asp Val Lys Asp Arg Ser Asn His Thr Val Ala Ala Asp Leu Asp Gly 
20 25 30 

Thr Leu Leu Ile Ser Arg Ser Ala Phe Pro Tyr Tyr Phe Leu Val Ala 
35 40 45 

Leu Glu Ala Gly Ser Leu Leu Arg Ala Leu Ile Leu Leu Val Ser Val 
50 55 60 

Pro Phe Val Tyr Leu Thr Tyr Leu Thr Ile Ser Glu Thr Leu Ala Ile 
65 70 75 80 

Asn Val Phe Val Phe Ile Thr Phe Ala Gly Leu Lys Ile Arg Asp Val 
85 90 95 

Glu Leu Val Val Arg Ser Val Leu Pro Arg Phe Tyr Ala Glu Asp Val 
100 105 110 

Arg Pro Asp Thr Trp Arg Ile Phe Asn Thr Phe Gly Lys Arg Tyr Ile 
115 120 125 

Ile Thr Ala Ser Pro Arg Ile Met Val Glu Pro Phe Val Lys Thr Phe 
130 135 140 

Leu Gly Val Asp Lys Val Leu Gly Thr Glu Leu Glu Val Ser Lys Ser 
145 150 155 160 

Gly Arg Ala Thr Gly Phe Thr Arg Lys Pro Gly Ile Leu Val Gly Gln 
165 170 175 

Tyr Lys Arg Asp Val Val Leu Arg Glu Phe Gly Gly Leu Ala Ser Asp 
180 185 190 

Leu Pro Asp Leu Gly Leu Gly Asp Ser Lys Thr Asp His Asp Phe Met 
195 200 205 

Ser Ile Cys Lys Glu Gly Tyr Met Val Pro Arg Thr Lys Cys Glu Pro 
210 215 220 

Leu Pro Arg Asn Lys Leu Leu Ser Pro Ile Ile Phe His Glu Gly Arg 
225 230 235 240 

Leu Val Gln Arg Pro Thr Pro Leu Val Ala Leu Leu Thr Phe Leu Trp 
245 250 255 

Leu Pro Val Gly Phe Val Leu Ser Ile Ile Arg Val Tyr Thr Asn Ile 
260 265 270 

Pro Leu Pro Glu Arg Ile Ala Arg Tyr Asn Tyr Lys Leu Thr Gly Ile 
275 280 285 

Lys Leu Val Val Asn Gly His Pro Pro Pro Pro Pro Lys Pro Gly Gln 
290 295 300 

Pro Gly His Leu Leu Val Cys Asn His Arg Thr Val Leu Asp Pro Val 
305 310 315 320 

Val Thr Ala Val Ala Leu Gly Arg Lys Ile Ser Cys Val Thr Tyr Ser 
325 330 335 

Ile Ser Lys Phe Ser Glu Leu Ile Ser Pro Ile Lys Ala Val Ala Leu 
340 345 350 

Thr Arg Gln Arg Glu Lys Asp Ala Ala Asn Ile Lys Arg Leu Leu Glu 
355 360 365 

Glu Gly Asp Leu Val Ile Cys Pro Glu Gly Thr Thr Cys Arg Glu Pro 
370 375 380 

Phe Leu Leu Arg Phe Ser Ala Leu Phe Ala Glu Leu Thr Asp Arg Ile 
385 390 395 400 

Val Pro Val Ala Ile Asn Thr Lys Gln Ser Met Phe Asn Gly Thr Thr 
405 410 415 

Thr Arg Gly Tyr Lys Leu Leu Asp Pro Tyr Phe Ala Phe Met Asn Pro 
420 425 430 

Arg Pro Thr Tyr Glu Ile Thr Phe Leu Lys Gln Ile Pro Ala Glu Leu 
435 440 445 

Thr Cys Lys Gly Gly Lys Ser Pro Ile Glu Val Ala Asn Tyr Ile Gln 
450 455 460 

Arg Val Leu Gly Gly Thr Leu Gly Phe Glu Cys Thr Asn Phe Thr Arg 
465 470 475 480 

Lys Asp Lys Tyr Ala Met Leu Ala Gly Thr Asp Gly Arg Val Pro Val 
485 490 495 

Lys Lys Glu Lys 
500 



<210> 18 
<211> 1620 
<212> DNA 

<213> Arabidopsis sp. 
<400> 18 

atggcggatc ctgatctgtc ttctcctttg atccaccatc aatcctccga tcaacctgaa 60 
gttgttatct ctatcgccga cgacgacgac gacgagtcag gactcaatct tcttccagcc 120 
gttgttgacc ctcgtgtttc acgaggtttt gagtttgacc atcttaatcc ttatggcttt 180 
ctcagcgagt cagagcctcc ggttctcggt ccgacgacgg tggatccatt ccggaacaat 240 
acacctggag ttagcggatt gtacgaagcg attaagctcg tgatttgtct tccgattgct 300 
ctgattagac ttgttctctt tgctgctagc ttagctgttg gttacttggc tacaaaattg 360 
gcacttgctg gctggaaaga taaagagaac cctatgcctc tttggagatg cagaatcatg 420 
tggattactc ggatctgtac cagatgtatc ctcttctctt ttggctatca gtggataaga 480 
aggaaaggga aacctgctcg gagagagatt gctccgattg ttgtatcaaa tcatgtttct 540 
tatattgaac caatcttcta cttctatgaa ttatcaccga ccattgttgc atcggagtca 600 
catgattcac ttccatttgt tggaactatt atcagggcaa tgcaggtgat atatgtgaat 660 
agattctcac agacatcaag gaagaatgct gtgcatgaaa taaagagaaa agcttcctgc 720 
gatagatttc ctcgtctgct gttattcccc gaaggaacca cgactaatgg gaaagttctt 780 
atttccttcc aactcggtgc tttcatccct ggttacccta ttcaacctgt agtagtccgg 840 
tatccccatg tacattttga tcaatcctgg ggaaatatct ctttgttgac gctcatgttt 900 
agaatgttca ctcagtttca caatttcatg gaggttgaat atcttcctgt aatctatccc 960 
agtgaaaagc aaaagcagaa tgctgtgcgt ctctcacaga agactagtca tgcaattgca 1020 
acatctttga atgtcgtcca aacatcccat tcttttgcgg acttgatgct actcaacaaa 1080 
gcaactgagt taaagctgga gaacccctca aattacatgg ttgaaatggc aagagttgag 1140 
tcgctattcc atgtaagcag cttagaggca acgcgatttt tggatacatt tgtttccatg 1200 
attccggact cgagtggacg tgttaggcta catgactttc ttcggggtct taaactgaaa 1260 
ccttgccctc tttctaaaag gatatttgag ttcatcgatg tggagaaggt cggatcaatc 1320 
actttcaaac agttcttgtt tgcctcgggc cacgtgttga cacagccgct ttttaagcaa 1380 
acatgcgagc tagccttttc ccattgcgat gcagatggag atggctatat tacaattcaa 1440 
gaactcggag aagctctcaa aaacacaatc ccaaacttga acaaggacga gattcgagga 1500 
atgtaccatt tgctagacga cgaccaagat caaagaatca gccaaaatga cttgttgtcc 1560 
tgcttaagaa gaaaccctct tctcatagcc atctttgcac ctgacttggc cccaacataa 1620 

<210> 19 
<211> 539 
<212> PRT 

<213> Arabidopsis sp. 
<400> 19 

Met Ala Asp Pro Asp Leu Ser Ser Pro Leu Ile His His Gln Ser Ser 
1 5 10 15 

Asp Gln Pro Glu Val Val Ile Ser Ile Ala Asp Asp Asp Asp Asp Glu 
20 25 30 

Ser Gly Leu Asn Leu Leu Pro Ala Val Val Asp Pro Arg Val Ser Arg 
35 40 45 

Gly Phe Glu Phe Asp His Leu Asn Pro Tyr Gly Phe Leu Ser Glu Ser 
50 55 60 

Glu Pro Pro Val Leu Gly Pro Thr Thr Val Asp Pro Phe Arg Asn Asn 
65 70 75 80 

Thr Pro Gly Val Ser Gly Leu Tyr Glu Ala Ile Lys Leu Val Ile Cys 
85 90 95 

Leu Pro Ile Ala Leu Ile Arg Leu Val Leu Phe Ala Ala Ser Leu Ala 
100 105 110 

Val Gly Tyr Leu Ala Thr Lys Leu Ala Leu Ala Gly Trp Lys Asp Lys 
115 120 125 

Glu Asn Pro Met Pro Leu Trp Arg Cys Arg Ile Met Trp Ile Thr Arg 
130 135 140 

Ile Cys Thr Arg Cys Ile Leu Phe Ser Phe Gly Tyr Gln Trp Ile Arg 
145 150 155 160 

Arg Lys Gly Lys Pro Ala Arg Arg Glu Ile Ala Pro Ile Val Val Ser 
165 170 175 

Asn His Val Ser Tyr Ile Glu Pro Ile Phe Tyr Phe Tyr Glu Leu Ser 
180 185 190 

Pro Thr Ile Val Ala Ser Glu Ser His Asp Ser Leu Pro Phe Val Gly 
195 200 205 

Thr Ile Ile Arg Ala Met Gln Val Ile Tyr Val Asn Arg Phe Ser Gln 
210 215 220 

Thr Ser Arg Lys Asn Ala Val His Glu Ile Lys Arg Lys Ala Ser Cys 
225 230 235 240 

Asp Arg Phe Pro Arg Leu Leu Leu Phe Pro Glu Gly Thr Thr Thr Asn 
245 250 255 

Gly Lys Val Leu Ile Ser Phe Gln Leu Gly Ala Phe Ile Pro Gly Tyr 
260 265 270 

Pro Ile Gln Pro Val Val Val Arg Tyr Pro His Val His Phe Asp Gln 
275 280 285 

Ser Trp Gly Asn Ile Ser Leu Leu Thr Leu Met Phe Arg Met Phe Thr 
290 295 300 

Gln Phe His Asn Phe Met Glu Val Glu Tyr Leu Pro Val Ile Tyr Pro 
305 310 315 320 

Ser Glu Lys Gln Lys Gln Asn Ala Val Arg Leu Ser Gln Lys Thr Ser 
325 330 335 

His Ala Ile Ala Thr Ser Leu Asn Val Val Gln Thr Ser His Ser Phe 
340 345 350 

Ala Asp Leu Met Leu Leu Asn Lys Ala Thr Glu Leu Lys Leu Glu Asn 
355 360 365 

Pro Ser Asn Tyr Met Val Glu Met Ala Arg Val Glu Ser Leu Phe His 
370 375 380 

Val Ser Ser Leu Glu Ala Thr Arg Phe Leu Asp Thr Phe Val Ser Met 
385 390 395 400 

Ile Pro Asp Ser Ser Gly Arg Val Arg Leu His Asp Phe Leu Arg Gly 
405 410 415 

Leu Lys Leu Lys Pro Cys Pro Leu Ser Lys Arg Ile Phe Glu Phe Ile 
420 425 430 

Asp Val Glu Lys Val Gly Ser Ile Thr Phe Lys Gln Phe Leu Phe Ala 
435 440 445 

Ser Gly His Val Leu Thr Gln Pro Leu Phe Lys Gln Thr Cys Glu Leu 
450 455 460 

Ala Phe Ser His Cys Asp Ala Asp Gly Asp Gly Tyr Ile Thr Ile Gln 
465 470 475 480 

Glu Leu Gly Glu Ala Leu Lys Asn Thr Ile Pro Asn Leu Asn Lys Asp 
485 490 495 

Glu Ile Arg Gly Met Tyr His Leu Leu Asp Asp Asp Gln Asp Gln Arg 
500 505 510 

Ile Ser Gln Asn Asp Leu Leu Ser Cys Leu Arg Arg Asn Pro Leu Leu 
515 520 525 

Ile Ala Ile Phe Ala Pro Asp Leu Ala Pro Thr 
530 535 

<210> 20 
<211> 1128 
<212> DNA 

<213> Arabidopsis sp. 



<400> 20 

atggaaaaaa agagtgtacc aaattctgat aagttgtctc tgattagagt gttaagaggt 60 
ataatatgtc tgatggtgtt agtttcaaca gcttttatga tgttgatatt ctgggggttc 120 
ttatcagctg tagtgttgag gcttttcagc attcgctata gccgtaaatg tgtttccttc 180 
ttctttggct cgtggctcgc cttgtggcct ttcctctttg agaagattaa caaaaccaaa 240 
gttatcttct ctggtgataa ggttccttgc gaggatcgag tattgctcat tgcaaaccac 300 
cgaacagaag ttgattggat gtacttctgg gatcttgcac tgcgtaaagg ccagattggg 360 
aatatcaaat atgtgcttaa gagtagtttg atgaaattac ctctctttgg ttgggcgttt 420 
cacctctttg agtttattcc tgttgagagg agatgggaag tcgatgaagc aaacttgaga 480 
cagatagttt cgagttttaa ggatccccga gacgctttat ggcttgctct tttccccgag 540 
ggcacagatt acacagaggc taaatgccaa aggagtaaga aatttgctgc tgaaaatggc 600 
cttccgatac tgaacaacgt gctgcttccc aggacaaaag gtttcgtctc ctgcttgcaa 660 
gaactgagtt gctcacttga cgcagtttat gatgtgacca tcggttataa aacccgctgc 720 
ccatctttct tagacaacgt ttatggaatt gagccatcag aagttcacat ccacatccgt 780 
cgtatcaacc tgacccaaat cccaaatcaa gaaaaggaca tcaatgcttg gttaatgaac 840 
acattccagc tcaaagacca gctgctcaat gacttttact ccaatggtca tttccctaac 900 
gaaggaacag agaaagagtt caacacaaag aagtacctca taaactgttt ggcagtgatt 960 
gccttcacca ccatctgtac acatctcacc ttcttctcat caatgatttg gttcaggatt 1020 
tatgtctctt tggcctgtgt ctacttgacc tctgctacgc atttcaatct tcgttctgtt 1080 
ccacttgttg agactgcaaa aaattccctc aaattagtaa acaaataa 1128 



<210> 21 
<211> 375 
<212> PRT 

<213> Arabidopsis sp. 



<400> 21 

Met Glu Lys Lys Ser Val Pro Asn Ser Asp Lys Leu Ser Leu Ile Arg 
1 5 10 15 

Val Leu Arg Gly Ile Ile Cys Leu Met Val Leu Val Ser Thr Ala Phe 
20 25 30 

Met Met Leu Ile Phe Trp Gly Phe Leu Ser Ala Val Val Leu Arg Leu 
35 40 45 

Phe Ser Ile Arg Tyr Ser Arg Lys Cys Val Ser Phe Phe Phe Gly Ser 
50 55 60 

Trp Leu Ala Leu Trp Pro Phe Leu Phe Glu Lys Ile Asn Lys Thr Lys 
65 70 75 80 

Val Ile Phe Ser Gly Asp Lys Val Pro Cys Glu Asp Arg Val Leu Leu 
85 90 95 

Ile Ala Asn His Arg Thr Glu Val Asp Trp Met Tyr Phe Trp Asp Leu 
100 105 110 

Ala Leu Arg Lys Gly Gln Ile Gly Asn Ile Lys Tyr Val Leu Lys Ser 
115 120 125 

Ser Leu Met Lys Leu Pro Leu Phe Gly Trp Ala Phe His Leu Phe Glu 
130 135 140 

Phe Ile Pro Val Glu Arg Arg Trp Glu Val Asp Glu Ala Asn Leu Arg 
145 150 155 160 

Gln Ile Val Ser Ser Phe Lys Asp Pro Arg Asp Ala Leu Trp Leu Ala 
165 170 175 

Leu Phe Pro Glu Gly Thr Asp Tyr Thr Glu Ala Lys Cys Gln Arg Ser 
180 185 190 

Lys Lys Phe Ala Ala Glu Asn Gly Leu Pro Ile Leu Asn Asn Val Leu 
195 200 205 

Leu Pro Arg Thr Lys Gly Phe Val Ser Cys Leu Gln Glu Leu Ser Cys 
210 215 220 

Ser Leu Asp Ala Val Tyr Asp Val Thr Ile Gly Tyr Lys Thr Arg Cys 
225 230 235 240 

Pro Ser Phe Leu Asp Asn Val Tyr Gly Ile Glu Pro Ser Glu Val His 
245 250 255 

Ile His Ile Arg Arg Ile Asn Leu Thr Gln Ile Pro Asn Gln Glu Lys 
260 265 270 

Asp Ile Asn Ala Trp Leu Met Asn Thr Phe Gln Leu Lys Asp Gln Leu 
275 280 285 

Leu Asn Asp Phe Tyr Ser Asn Gly His Phe Pro Asn Glu Gly Thr Glu 
290 295 300 

Lys Glu Phe Asn Thr Lys Lys Tyr Leu Ile Asn Cys Leu Ala Val Ile 
305 310 315 320 

Ala Phe Thr Thr Ile Cys Thr His Leu Thr Phe Phe Ser Ser Met Ile 
325 330 335 

Trp Phe Arg Ile Tyr Val Ser Leu Ala Cys Val Tyr Leu Thr Ser Ala 
340 345 350 

Thr His Phe Asn Leu Arg Ser Val Pro Leu Val Glu Thr Ala Lys Asn 
355 360 365 

Ser Leu Lys Leu Val Asn Lys 
370 375 



<210> 22 
<211> 1170 
<212> DNA 

<213> Arabidopsis sp. 



<400> 22 

atggtgattg ctgcagctgt catcgtgcct ttgggccttc tcttcttcat atctggtctc 60 
gctgtcaatc tctttcaggc agtttgctat gtactcattc gaccactgtc taagaacaca 120 
tacagaaaaa ttaaccgggt ggttgcagaa accttgtggt tggagcttgt atggatagtt 180 
gactggtggg ctggagttaa gatccaagtg tttgctgata atgagacctt caatcgaatg 240 
ggcaaagaac atgctcttgt cgtttgtaat caccgaagtg atattgattg gcttgtggga 300 
tggattctgg ctcagcggtc aggttgcctg ggaagcgcat tagctgtaat gaagaagtct 360 
tccaaattcc ttccagtcat aggctggtca atgtggttct cggagtatct ctttctggaa 420 
agaaattggg ccaaggatga aagcactcta aagtcaggtc ttcagcgctt gagcgacttc 480 
cctcgacctt tctggttagc cctttttgtg gagggaactc gctttacaga agccaaactt 540 
aaagccgcac aagagtatgc agcctcctct gaattgccta tccctcgaaa tgtgttgatt 600 
cctcgcacca aaggtttcgt gtcagctgtt agtaatatgc gttcatttgt cccagcaatt 660 
tatgatatga cagtgactat tccaaaaacc tctccaccac ccacgatgct aagactattc 720 
aaaggacaac cttcagtggt gcatgttcac atcaagtgtc actcgatgaa agacttacct 780 
gaatcagatg acgcaattgc acagtggtgc agagatcagt ttgtggctaa ggatgctctg 840 
ttagacaaac acatagctgc agacactttc cccggtcaac aagaacagaa cattggccgt 900 
cccataaagt cccttgcggt ggttctatca tgggcatgcg tactaactct tggagcaata 960 
aagttcctac actgggcaca actcttttct tcatggaaag gtatcacgat atcggcgctt 1020 
ggtctaggta tcatcactct ctgtatgcag atcctgatac gctcgtctca gtcagagcgt 1080 
tcgaccccag ccaaagtcgt cccagccaag ccaaaagaca atcaccaccc agaatcatcc 1140 
tcccaaacag aaacggagaa ggagaagtaa 1170 

<210> 23 
<211> 389 
<212> PRT 

<213> Arabidopsis sp. 
<400> 23 

Met Val Ile Ala Ala Ala Val Ile Val Pro Leu Gly Leu Leu Phe Phe 
1 5 10 15 

Ile Ser Gly Leu Ala Val Asn Leu Phe Gln Ala Val Cys Tyr Val Leu 
20 25 30 

Ile Arg Pro Leu Ser Lys Asn Thr Tyr Arg Lys Ile Asn Arg Val Val 
35 40 45 

Ala Glu Thr Leu Trp Leu Glu Leu Val Trp Ile Val Asp Trp Trp Ala 
50 55 60 

Gly Val Lys Ile Gln Val Phe Ala Asp Asn Glu Thr Phe Asn Arg Met 
65 70 75 80 

Gly Lys Glu His Ala Leu Val Val Cys Asn His Arg Ser Asp Ile Asp 
85 90 95 

Trp Leu Val Gly Trp Ile Leu Ala Gln Arg Ser Gly Cys Leu Gly Ser 
100 105 110 

Ala Leu Ala Val Met Lys Lys Ser Ser Lys Phe Leu Pro Val Ile Gly 
115 120 125 

Trp Ser Met Trp Phe Ser Glu Tyr Leu Phe Leu Glu Arg Asn Trp Ala 
130 135 140 

Lys Asp Glu Ser Thr Leu Lys Ser Gly Leu Gln Arg Leu Ser Asp Phe 
145 150 155 160 

Pro Arg Pro Phe Trp Leu Ala Leu Phe Val Glu Gly Thr Arg Phe Thr 
165 170 175 

Glu Ala Lys Leu Lys Ala Ala Gln Glu Tyr Ala Ala Ser Ser Glu Leu 
180 185 190 

Pro Ile Pro Arg Asn Val Leu Ile Pro Arg Thr Lys Gly Phe Val Ser 
195 200 205 

Ala Val Ser Asn Met Arg Ser Phe Val Pro Ala Ile Tyr Asp Met Thr 
210 215 220 

Val Thr Ile Pro Lys Thr Ser Pro Pro Pro Thr Met Leu Arg Leu Phe 
225 230 235 240 

Lys Gly Gln Pro Ser Val Val His Val His Ile Lys Cys His Ser Met 
245 250 255 

Lys Asp Leu Pro Glu Ser Asp Asp Ala Ile Ala Gln Trp Cys Arg Asp 
260 265 270 

Gln Phe Val Ala Lys Asp Ala Leu Leu Asp Lys His Ile Ala Ala Asp 
275 280 285 

Thr Phe Pro Gly Gln Gln Glu Gln Asn Ile Gly Arg Pro Ile Lys Ser 
290 295 300 

Leu Ala Val Val Leu Ser Trp Ala Cys Val Leu Thr Leu Gly Ala Ile 
305 310 315 320 

Lys Phe Leu His Trp Ala Gln Leu Phe Ser Ser Trp Lys Gly Ile Thr 
325 330 335 

Ile Ser Ala Leu Gly Leu Gly Ile Ile Thr Leu Cys Met Gln Ile Leu 
340 345 350 

Ile Arg Ser Ser Gln Ser Glu Arg Ser Thr Pro Ala Lys Val Val Pro 
355 360 365 

Ala Lys Pro Lys Asp Asn His His Pro Glu Ser Ser Ser Gln Thr Glu 
370 375 380 

Thr Glu Lys Glu Lys 
385 



<210> 24 

<211> 269 

<212> DNA 

<213> Glycine max 

<400> 24 

gacccactga acgctctcat caccttcacg tggctcccct tcggcttcat cctctccatc 60 

ataagggtct acttcaacct ccctctccca gaacncattg tccgctacac ctacgagatg 120 

ctcggcatca acctcgtcat ccgcggccac cgccctcctc cgccttcccc cggcaccccc 180 

ggcaacctct acgtctgcaa ccaccgcacc gctctcgacc ccatcgtcat cgccattgcc 240 
ctcggccgca aggtctcctg cgtcaccta 269 

<210> 25 

<211> 242 

<212> DNA 

<213> Glycine max 

<400> 25 

tgatcttcca cgacggccgt ttcgtgcaga ggccagaccc actgaacgct ctcatcacct 60 
tcacgtggct ccccttcggc ttcatcctct ccatcataag ggtctacttc aaccttcctc 120 
tcccagaacg cattgtccgc tacacctacg agatgctcgg catcaacctc gtcatccgcg 180 
gccaccgccc tcctccgcct tcccccggca cccccggcaa cctctacgtc tgcaaccacc 240 
gc 242 

<210> 26 

<211> 272 

<212> DNA 

<213> Glycine max 

<400> 26 

gtttgttcaa aggccaactc ctctagcagc cctcttgacc ttcctatggt tgccaattgg 60 
catcatactc tccatnctta agggtctacc ttaacatccc tttgcctgaa agaattgctt 120 
ggtataacta taagctatta ggaatcagag ttattgtgaa gggtacccct ccaccacccc 180 
caaagaaggg tcaaagtggt gtcctatttg tttgtaacca ccgcacagtt ttagaccctg 240 
tggttactgc agttgcactt ggaagaaaaa tt 272 

<210> 27 

<211> 218 

<212> DNA 

<213> Glycine max 

<400> 27 

atagcacagg agggttacat ggtgcctccg agcaaatcag caaaggcagt cccacaggag 60 

cgtctgaaga gcagaatgat cttccacgac gggcgtttcg tgcagaggcc agacccaatg 120 

aatgccctca tcaccttcac atggctccct ttgggtttcg tcctctccat cataagggtc 180 

tacttcaacc tccctctccc agaacgcatc gtccgcta 218 

<210> 28 

<211> 270 

<212> DNA 

<213> Glycine max 

<400> 28 

gtgcctgttg ctgtgaactg caagcagaac atgttctttg gaaccaccgt tcgtggcgtc 60 
aagttctggg acccttaact tacttcttac atgaacccta ggcctgtgta cgaggttacc 120 
ttaccttgat acctttgccg aggagatgtc ggttaaggct ggggggaagt cgtccattga 180 
ggtggccaac cacgtggcag aaggtgctgg gggatgtgtt agggtttgag tgcaccgggt 240 
tgactaggaa ggataagtat atgttgttgg 270 

<210> 29 

<211> 252 

<212> DNA 

<213> Glycine max 

<400> 29 

catgagggta ggtttgctca aaggccaact cctctagctg ccctcttgac cttcctatgg 60 
ctgccaattg gcatcatact ctccatctta agggtctacc ttaacatccc tttgcctgaa 120 
agaattgttg gtacaactac aagctcttag gaatcagagt tattgtgaag ggtacccctc 180 
caccgccccc aaagaagggt caaagtggtg tctatttgtt tgtaaccacc gcacagtatt 240 
agaccctgtt gt 252 

<210> 30 

<211> 272 

<212> DNA 

<213> Glycine max 

<400> 30 

ctgggactgc cttaaacgat gcatggatct tatcaagaaa ggagcctctg tttttttctt 60 
tccagaggga acacgcagta aagatggaag actaggcaca ttcaagaagg gtgctttcag 120 
tgttgctgca aagacaaatg caccagtagt accaattacc cttattggaa ctggtcaaat 180 
catgcctgca ggaaaggagg gaatagtgaa cataggttct gtgaaagtgg ttatacataa 240 
acctattgtt ggaaaggatc ctgacatgtt at 272 

<210> 31 

<211> 239 

<212> DNA 

<213> Glycine max 

<400> 31 

cgggaatcaa ggtcatcaga cttcaagggt gtttcagctg ttgtcactga cagaattcga 60 
gaagctcatc agaatgagtc tgctccatta atgatgttat ttccagaagg tacaaccaca 120 
aatggagagt tcctccttcc attcaagact ggtggttttt tggcaaaggc accggtactt 180 
cctgtgatat tacgatatca ttaccagaga tttagccctg cctgggattc catatctgg 239 

<210> 32 

<211> 242 

<212> DNA 

<213> Glycine max 

<400> 32 

gaacggcaac ggcaacagcg ttcgcgatga ccgtcctctg ctgaagccgg agcctccggt 60 
cttccgccga cagcatcgcc gatatggaga agaagttcgc cgcttacgtc cgccgctacg 120 
tgtacggcac catgggacgc ggcgagttgc ctcccaagga gaagctcttg ctcggtttcg 180 
cgttggtcac tcttctcccc attcgagtcg ttctcgccgt caccatattg ctcttttatt 240 
ac 242 



<210> 33 

<211> 248 

<212> DNA 

<213> Glycine max 

<400> 33 

ttcttcttct ctcactctct aaaaccctaa ctctatacat ggaagggaaa nctcaaatct 60 
natgactaat taattaatcc atcgatcaag catggagtcc gaactcaaag acctcaattc 120 
gaagccgccg aacggcaacg gcaacagcgt tcgcgatgac cgtcctctgc tgaagccgga 180 
gcctccggtc tccgccgaca gcatcgccga tatggagaag aagttcgccg cttacgtccg 240 
ccgcgacg 248 

<210> 34 

<211> 217 

<212> DNA 

<213> Glycine max 



<400> 34 

aaaaccctaa ttctatacat ggaagggaaa tctcaaatct aatgactaat taattaatcc 60 
atcgatcaag catggagtcc gaactcaaag acctcaattc gaagccgccg aacggcaacg 120 
gcaacagcgt tcgcgatgac cgtcctctgc tgaagccgga gcctccggtc tccgccgaca 180 
gcatcgccga tatggagaag aagttcgccg cttacgt 217 

<210> 35 

<211> 257 

<212> DNA 

<213> Glycine max 

<400> 35 

atctctgtct ctgcatttcc ctccctaaaa ccctaattct acatttggaa aggaaatctc 60 
aaatctaatg actaattaat caatcaatcg tattaataat ccatcgatca agtatggagt 120 
ccgaactcaa agacctcaat tcgaagccac ccaactgcaa cggcaacgcc aacagcgttt 180 
gcgacgaccg tcctctgctg aagccggagc ctccggcctc ctccgacagc atcgccgaga 240 
tggagaagaa gttcgcc 257 

<210> 36 

<211> 284 

<212> DNA 

<213> Glycine max 

<400> 36 

cccgaccaaa acaggttttt gtggccaatc atacttccat gattgatttc attatcttag 60 
aacagatgac tgcatttgct gttattatgc agaagcatcc tggatgggtt ggattattgc 120 
agagcaccat tntggagagt gtagggtgta tctggttcaa ccgtacagag gcaaaggatc 180 
gagaagttgt ggcaaggaaa ttgagggatc atgtcctggg agctaacaac aaccctcttc 240 
ttatatttcc tgaaggaact tgtgtaaata atcactactc gtca 284 

<210> 37 

<211> 246 

<212> DNA 

<213> Glycine max 

<400> 37 

ggagatccgc ataagcaaat caatcatcct gttccttcct tatctctgtc tctgcatttc 60 
cctccctaaa accctaattc tacatttgga aaggaantct caaatctaat gataattaat 120 
caatcaatcg tattaataat ccatcgatca agtatggagt ccgaactcaa agacctcaat 180 
tcgaagccac ccaactgcaa cggcaacgcc aacagcgttt gcgacgaccg tcctctgctg 240 
aagccg 246 

<210> 38 

<211> 278 

<212> DNA 

<213> Glycine max 

<400> 38 

gttttctatt gccacgttgt ggaagcgtaa cgaagatgaa tggcattggg aaactcaaat 60 
cgtcgagttc tgaattggac cttcacattg aagattacct accttctgga tccagtgttc 120 
aacaagaacg gcatggcaag ctccgactgt gtgatttgct agacatttct cctagtctat 180 
ctgaggcagc acgtgccatt gtagatgata cattcacaag gtgcttcaag caaatcctcc 240 
agaaccttgg aactggaatg tttatttgtt tcctttgt 278 

<210> 39 

<211> 312 

<212> DNA 

<213> Glycine max 



<400> 39 

ttaactttgg cacattctcc ttttgttcat caatgtgtgt tgtaaattgt ncatttcctt 60 
cagaggtctt tggtaganat gatgtgcagt ttctgtggtg catcttggac tgnggntgtt 120 
aagnatcatg gacccaggcc tagcaggaga ccaaagcagg tttttgtagc caaccatact 180 
tcatgattga tntcattatn tnagaacaga tgactgcttt tgcngttatn atgcagaagc 240 
atcctggatg ggttggtaag cntacagnat gtcaacngtg tatnaaatat gntacacnnn 300 
acttgcgtct tc 312 



<210> 40 

<211> 255 

<212> DNA 

<213> Glycine max 
<400> 40 

ggattattgn ngcanatgca gtcatctgtt ctaagataat ganatcnatc atggaagtat 60 
gattggncac anaaacctgt yttttggttg gatactaggt cttggcccat ggtacttgac 120 
naccccagtc catgatgcaa canaganact gnacatcatc tccaccaaac ccctctgana 180 
ganacgagaa ttgagcaatt tagagtacct tggtttgatg caagtcagta tattcaagtt 240 
tctattcatc aaagg 255 

<210> 41 

<211> 291 

<212> DNA 

<213> Glycine max 



<400> 41 

caacctccca tgcaatcgct caccctctcc gtcacctgaa tctgttttct attccctccg 60 
tcgcgtaaca aggatgaatg gcattgggaa actcaaatcg tcgagttctg aattggacct 120 
tcacattgaa gattacctgc cttctggatc cagtgttcaa caagaacggc atggcaagct 180 
ccgcctgtgt gatttgctag acatttctcc tagtctatct gaggcagcac gtgccattgt 240 
agatgataca ttcacaaggt gcttcaagtc aaatcctcca gaaccttgga a 291 



<210> 42 

<211> 284 

<212> DNA 

<213> Glycine max 

<400> 42 

ctgcaaccta ccatgcaatt cctcacctga atccgttttc tattgccacg ttgtggaagc 60 
gtaacgaaga tgaatggcat tgggaaactc aaatcgtcga gttctgaatt ggaccttcac 120 
attgaagatt acctaccttc tggatccagt gttcaacaag aacggcatgg caagctccga 180 
ctgtgtgatt tgctagacat ttctcctagt ctatctgagg cagcacgtgc catgtagatg 240 
atacatcaca aggtgctcaa gtcaaatctc cagaaccttg gaat 284 

<210> 43 

<211> 268 

<212> DNA 

<213> Glycine max 

<400> 43 

ctgaagtatt ctcgtcctag cccaaagcat agagaaaggn agcaacagaa ctttgctgag 60 
tcagtgctgc ggcgatggga ggaaaagtga tgtgtacctt tatgtggtgt tgttcttaat 120 
tattcttagt aatgccattg cttcgacccc tttttttgct tttgttttgt cattgctaac 180 
tatttatttt taacactttt attaaagata tggcatatat ncacttcagt anacaaagtt 240 
gtnccagtaa tttnttttcc aaaaaaaa 268 

<210> 44 

<211> 241 

<212> DNA 

<213> Glycine max 

<400> 44 

gancaaaatt gccctccatc actttccttg ttagagttgg tttctgcnac ctaccatgca 60 
attccctcac ctgaatccgt tttctattgc cacgttgtgg aagcgtaacg aagatgaatg 120 
gcattgggaa actcaaatcg tcgagttctg aattggacct tcacattgaa gattacctac 180 
cttctggatc cagtgttcaa caagaacggc atggcaagct ccgactgtgt gatttgctag 240 
a 241 

<210> 45 

<211> 247 

<212> DNA 

<213> Glycine max 



<400> 45 

gtaggatgtc tgagatcctt gccccaatca aaacggtgcg gttaactaga aaccgcgacg 60 
aggatgcgaa aatgatgaaa aatttgctgg ggcaagggga cctggtggtt tgtcctgaag 120 
ggaccacatg tagagaacct tatttattga ggttcagccc tctgttctca gagatgtgcg 180 
atgagattgt ccccgttggc agttgattcc cagttatatg ttccacggaa ccactgctgg 240 
tgganta 247 



<210> 46 
<211> 271 
<212> DNA 
<400> 46 

tgcagggggg cttgttagag ccatagtttt ggttcttcta tacccttttg tttgtgtcgt 60 
aggaaaagag atggggttga agataatggt catggcatgc ttcttcggga tcaaagcatc 120 
gagcttcaga gttggaaggt ccgttttgcc cnaattcttc tnggaggacg ttngtgcaga 180 
aatgtttgag gcactcaaaa aaggagggaa gacagtggga gttaccaatt taccccacgt 240 
gatggtggaa agcttcttga gagagtattt g 271 

<210> 47 

<211> 242 

<212> DNA 

<213> Glycine max 

<400> 47 

ttcacagctg tcacgccgtn aacggaaaat ggcaacggcg agacgcagtt tcccgcctat 60 
caccgaatgc aacggaacga cnccgtgcga ntctgtngnc gccgacctcg agggtacgct 120 
cctcatctcc cgtngctcgt tcccgtactt catgctcgtc gccgtcgaag ccggcagcnt 180 
cctccgcggc ctcatgctnc tcctctccct tccgttcgtc atnatcgcct acctcttcat 240 
ct 242 

<210> 48 

<211> 244 

<212> DNA 

<213> Glycine max 

<400> 48 

acatattctt cagttagctc ccccaaccta tacacttcac caccacacca caaccctacc 60 
ctctctctct gtcatggtca ttggaggagc cttccctcgt ttcgacccaa tcaccaaatg 120 
tagacccaag accgctccaa ccagaccatc gcctcggacc tcgatggcac cctccttgtc 180 
tcccggagtg ccttccccta ctacttcctc gtcgccctcg aagccggcag cgtcttccga 240 
gcct 244 

<210> 49 

<211> 230 

<212> DNA 

<213> Glycine max 

<400> 49 

caacattcca cctagctccc caatcacatc ttcaccacac cataaacctt cttaatttct 60 
ctcttcattt tctcctctat tgtcataatc atggggacct tccctcgctt cgacccaatc 120 
accacccaag accggtccaa ccagaccgtg gcctccgacc ttgacggcac cctcctcgtc 180 
tcccggagcg ccttccccta ctacctcctc gttgccctcg aagccggcag 230 

<210> 50 

<211> 265 

<212> DNA 

<213> Glycine max 



<400> 50 

ctggtgaata atcctaagtt atggagtctg tggtgtgtga gctagaaggc acgcttgtga 60 
aggacaagga tgcgttctca tacttcatgt tggttgcgtt tgaagcttca ggtttggttc 120 
gtttcgcctt gttgctaaca ctattgcccg tgattcggtt ccttgacatg gttggcatga 180 
acgatgcatc tctcaagcta ntnatcttcg tggctgtggc tggtgttcca aagtccgaga 240 
ttgaatcagt ggctagggca gtttt 265 



<210> 51 

<211> 252 

<212> DNA 

<213> Glycine max 

<400> 51 

ctggtgaata atcctaagtt atggagtctg tggtgtgtga gctagaaggc acgcttgtga 60 
aggacaagga tgcgttctca tacttcatgt tggttgcgtt tgaagcttca ggtttggttc 120 
gtttcgcctt gttgctaaca ctattgcccg tgattcggtt ccttgacatg gttggcatga 180 
acgatgcatc tctcaagcta atgatcttcg tggctgtggc tgggttccaa agtccgagat 240 
tgaatcagtg gc 252 



<210> 52 
<211> 218 
<212> DNA 

<213> Glycine max 

<400> 52 

aactgcaact acaacaacat tcattcattc acagctgtca cgccgtgaac ggaaaatggc 60 
aacggcgaga cgcagtttac ccgcctatac accgaatgca acggaacgac accgtgcgag 120 
tctgtggccg ccgacctcga cggtacgctc ctcatntccc gtagctcgtt cccgtacttc 180 
atgctcgtcg ccgtcgaagc cggcagcctc ctccgcgg 218 

<210> 53 

<211> 262 

<212> DNA 

<213> Glycine max 

<400> 53 

ggttaaggac attgagatgg tcgnntcctc ggtgctgccc aagttctaca ccgaggacgt 60 
gcnccccgag agctggagag tcttcaatcc ttcgggaagc gttacattgt cactgctagt 120 
ctagggtgat ggtggagcan tttgttaaga cgtttcttgg ggctgataag gtgcttggga 180 
ctgagcttga ggccacgaaa tcggggaggt tcatgggttt gttaaggagc ctggtgtgct 240 
tgttggggag cacaagaaag tg 262 



<210> 54 

<211> 212 

<212> DNA 

<213> Glycine max 

<400> 54 

gcaactacaa caacattcat tcattcacag ctgtcacgcc gtgaacggaa aatggcaacg 60 
gcgagacgca gtttcccgcc tatcaccgaa tgcaacggaa cgacgccgtg cgagtctgtg 120 
gccgccgacc tcgacggtac gctcctcatc tcccgtagnc cgttcccgta cttcatgctc 180 
gtngccgtcg aagccggcag cctcctccgc gg 212 

<210> 55 

<211> 273 

<212> DNA 

<213> Glycine max 



<400> 55 

catggttttc ttgagcttct ttggcctcag aaaggacaca ttcagaacag gatcagctgt 60 
tctggcaaag ttcttcttag aagatgttgg attggaaggc tttgaggccg taatatgttg 120 
tgagagaaaa gtggcatcta gtaagttgcc aagggtcatg gttgaaaatt tcctcaagga 180 
ctatttaggg gttgatgctg ttatagcaag agaattgaag tcctttagtg gcttcttttt 240 
gggagttttt gagagtaaga agccaattaa aat 273 

<210> 56 

<211> 257 

<212> DNA 

<213> Glycine max 

<400> 56 

ctctcaaaaa aggagggaag acagtgggag tcaccaatct accccatgtg atggtggaaa 60 
gcttcttgag agagtatttg gacattgatt tcgttgtggg cagggagctg aaagttttct 120 
gtggatacta cgtaggattg atggatgaca caaaaactat gcatgccttg gagctggtta 180 
aagaaggaaa aggatgctcc gacatgatcg gaatcacaag gtttcgcaac atacgcgacc 240 
atgatgattt tttctcc 257 

<210> 57 

<211> 240 

<212> DNA 

<213> Glycine max 

<400> 57 

gaactaagtg tgaaccacta ccaagaaaca agcttttaag tccaattatt tttcatgagg 60 
gtaggtttgc tcaaaggcca actcctctag ctgnnctctt gaccttccta tggctgccaa 120 
ttggcatcat actctccatc ttaagggtct accttaacat ccctttgcct gaaagaattg 180 
cttggtacaa ctacaagctc ttaggaatca gagttattgt gaagggtacc cctccaccgc 240 



<210> 58 
<211> 254 
<212> DNA 
<213> Glycine max 
<400> 58 

cttggaataa gggtcattag gaagggtatc cctccacccc cagcnaagaa gggccaaagt 60 
ggagtcctat ttgtatgcaa ccacaggaca gttttagacc ctgtggttac agctgttgca 120 
ttaggaagga aaattagctg tgtcacatat agcataagca aattcactga aataatttca 180 
ccaatcaaag ctgtggcact ctctagggag agggacaaag atgctgccaa catcaagang 240 
ttgcttgagg aagg 254 

<210> 59 

<211> 267 

<212> DNA 

<213> Glycine max 

<400> 59 

gccaganaga cttgcttggt acaactacaa gcttcttgga ataagggtca ttaggaaggg 60 
tatccctcca cccccagcaa agaagggcca aagtggagtc ctatttgtat gcaaccacag 120 
gacagtttta gaccctgtgg ttacagctgt tgcattagga aggaaaatta gctgtgtcac 180 
atatagcata agcaaattca ctgaaataat tcaccaatca aagctgtggc actctctagg 240 
gagagggacc nagatgctgc cnacatc 267 

<210> 60 

<211> 261 

<212> DNA 

<213> Glycine max 



<400> 60 

gtaaccacag ggtctaaaac tgtgcggtgg ttactgcagt tgcacttgnc nagaaaaatt 60 
tgcttatgct atatgtgaca cagctaattc actgnaataa tttcaccaat taaagctgtg 120 
gcactctcaa ggganngaga gaaagatgct gccaatatcc ngagactact tgaggaaggg 180 
gacttggtga tttgccctga aggcacaact tgtagagagc cttcctcttg aggttcagtg 240 
cactatttgc tgaactcact g 261 



<210> 61 

<211> 258 

<212> DNA 

<213> Glycine max 

<400> 61 

caaggagctc acatgcagtg gagggaaatc agctattgaa gttgcaaact acattcaaag 60 
ggttcttgca gggactttgg gatttgagtg cacaaatttg actaggaaga gcaaatatgc 120 
catgcttgca ggcacagatg ggacagttcc atctaaggag aaggcttgan aagggagaga 180 
aattaagttc tcccttttga ttattctgta ttggtgccca atgtgtttcc aaaacactta 240 
gaattatgat agaaataa 258 

<210> 62 

<211> 258 

<212> DNA 

<213> Glycine max 



<400> 62 

attggcataa tcctctccat cctaagggtc tatctcaaca tccctctgcc agaaagactt 60 

gcttgntaca actacaagct tcttggaata agggtcatta ggaagggtat ccctccaccc 120 

ccagcaaaga agggccaaag tggagcctat ttgtatgcaa ccacaggaca gttttagacc 180 

ctgtggttac agctgttgca ttaggaagga aaattagctg tgtcacatat agcataagca 240 
aattcactga aataattt 258 

<210> 63 

<211> 239 

<212> DNA 

<213> Glycine max 

<400> 63 

cacttcacca ccacaccaca accctaccct ctctctctgt catggtcatt ggaggagcct 60 
tccctcgttt cgacccaatc accaaatgta gcacccaaga ccgctccaac cagaccatcg 120 
cctcggacct cgatggcacc ctccttgtct cccggagtgc cttcccctac tacttcctcg 180 
tcgccctcga agccggcagc gtcttccgag ccctccttct cttaaccttc gtccccttc 239 



<210> 64 
<211> 531 
<212> DNA 

<213> Glycine max 



<400> 64 

ccgagaaccg gtctaaccaa accgtggcct cggacttgga cggcaccctc ctggtgtccc 60 
ccagcgcatt tccttactac atgctggtcg ccatcgaagc cggcagcttc ctccgtggcc 120 
ttgtcctcct tgcctccgtc cctttcgtgt attcacgtac atattcctct ccgagaccgc 180 
ggccatcaag tccctgatct tcatcgcctt cgcgggcctg aaggtcaggg acgttgagat 240 
ggtcgcgtgc tcggtgctgc ccaagttcta cgccgacata ttcttcagtt agctccccca 300 
acctatacac ttcaccacca caccacaacc ctaccctctc tctctgtcat ggtcattgga 360 
ggagccttcc ctcgtttcga cccaatcacc aaatgtagca cccaagaccg ctccaaccag 420 
accatcgcct cggacctcga tggcaccctc cttgtctccc ggagtgcctt cccctactac 480 
ttcctcgtcg ccctcgaagc cggcagcgtc ttccgagccc tccttctctt a 531 



<210> 65 

<211> 256 

<212> DNA 

<213> Glycine max 

<400> 65 

acatattctt cagttagctc ccccaaccta tacacttcac caccacacca caaccctacc 60 
ctctctctct gtcatggtca ttggaggagc cttccctcgt ttcgacccaa tcaccaaatg 120 
tagcacccaa gaccgctcca accagaccat cgcctcggac ctcgatggca ccctccttgt 180 
ctcccggagt gccttcccct actacttcct cgtcgccctc gaagccggca gcgtcttccg 240 
agccctcctt ctctta 256 

<210> 66 

<211> 260 

<212> DNA 

<213> Glycine max 



<400> 66 

ccatccaaca tattcttcag ttagctcccc caacctatac acttcaccac cacaccacaa 60 
ccctaccctc tctctctgtc atggtcattg gaggagcctt ccctcgtttc gacccaatca 120 
ccaaatgtag cacccaagac cgctccaacc agactatcgc ctcggacctc gatggcaccc 180 
tccttgtctc ccggagtgcc ttcccctact acttcctcgt cgccctcgaa gccggcagcg 240 
tcttccgagc cctccttctc 260 

<210> 67 

<211> 248 

<212> DNA 

<213> Glycine max 

<400> 67 

caccaaccaa acctcactct ccctttctcc cctgaccctc tccctgccat ggtcatggga 60 
gcctttggcc acttcgaacc ggtctccaaa tgcagcaccg agaaccggtc taaccaaacc 120 
gtggcctcgg acttggacgg caccctcctg gtgtccccca gcgcatttcc ttactacatg 180 
ctgggcgcca tcgaagccgg cagcttcctc cgtggccttg tcctccttgc ctccgtccct 240 
ttcgtgta 248 

<210> 68 

<211> 283 

<212> DNA 

<213> Glycine max 



<400> 68 

ttcttcccca ccatcacacc aancaaacct cactctncct ggccatggtc atgnnngcct 60 
ttccgccact tcgaaccggt ttccaaatgc agcaccgaaa accggtttaa ccaaaccgtg 120 
gcctcggact tggacggcac cctcctggtg tcccctagcg cctttcctta ctacatgctc 180 
gtcgccatcg aagccggcag cttcctccgt ggccttgtcc tccttggatc cgtccctttc 240 
gtgtacttca cgtacatatt cttctccgag accgcggcca tca 283 

<210> 69 
<211> 258 
<212> DNA 
<213> Glycine max 

<400> 69 

ctcttcttcc ccaccatcnn accaaccaaa cctcactctc cctgaccatg gtcatgggag 60 
cctttcgcca cttcgaaccg gtttccaaat gcagcaccga aaaccggttt aaccaaaccg 120 
tggcctcgga cttggacggc accctcctgg tgtcccctag cgcctttcct tactacatgc 180 
tcgtcgccat cgaagccggc agcttcctcc gtggccttgt cctccttgga tccgtccctt 240 
tcgtgtactt cacgtaca 258 

<210> 70 

<211> 256 

<212> DNA 

<213> Glycine max 

<400> 70 

tgcaactaca acaacattca ttcattcaca gctgtcacgc cgtgaacgga aaatggcaac 60 

ggcgagacgc agtttcccgc ctatcaccga atgcaacgga acgacaccgt gcgagtctgt 120 

ggccgccgac ctcgacggta cgctcctcat ctcccgtagc tcgttcccgt acttcatgct 180 

cgtcgccgtc gaagccggca gcntcctccg cggcctcatc ctcctcctng ccantccgtt 240 

cgtcatcanc gcctac 256 

<210> 71 

<211> 259 

<212> DNA 

<213> Glycine max 

<400> 71 

cttccccacc atcacaccan ggcnaacctc antctccctt tctccacnga ccctctccct 60 
gccatngtca tgggancctt tggccacttc gaaccggtct ccaaatgcag caccgagaac 120
cggnctaacc aaaccgtggc ctcggacttg gacggcaccc tcctggtgtc ccncagcgca 180
tttccttact acatgctggc ngccatcgaa gccggcagct tcctccgtgg ccttgtcctc 240
cttgcctccg tccctttcg 259

<210> 72

<211> 249 

<212> DNA 

<213> Glycine max 

<400> 72 

ccaacatatt cttcagttag ctcccccaac ctatacactt caccaccaca ccacaaccct 60
accctctctc tctgtcatgg tcattggagg agccttccct cgtttcgacc caatcaccaa 120
atgtagcacc caagaccgct ccaaccagac catcgcctcg gacctcgatg gcaccctnct 180
tgtctcccgg agtgccttcc cctactactt cctcgtcgcc ctcgaagccg gcagcgtctt 240
ncgagccct 249



<210> 73 

<211> 257 

<212> DNA 

<213> Glycine max 

<400> 73 

caaccctctt cttccccacc atcacaccaa ncaaacctca ctctcccttt ctcccctgac 60 
cctctccctg ccatggtcat gggagccttt ggccacttcg aaccggtctc caaatgcagc 120 
accgagaacc ggtctaacca aaccgtggcc tcggacttgg acggcaccct cctggtgtcc 180 
cccagcgcat ntccttacta catgctggtc gccatcgaag ccggcagctt cctccgtggc 240 
cttgtcctcc ttgcctg 257 



<210> 74 

<211> 255 

<212> DNA 

<213> Glycine max 

<400> 74 

gccgaagacg tgcacccgga gagttggaga gtgttcaact ctttcgggaa gcgttacatt 60
gtcacggcta gtcctagggt gatggtggag ccgtttgtta aggcgtttct cggggctgac 120
aaggtgcttg ggactgaact tgaggccacc aaatcgggga cgttcactgg gtttgttaag 180
aagcctggtg tgcttgttgg ggagcataag aaagtggctc tggtgaagga gtttcagggt 240
aattacctga cttgg 255



<210> 75 

<211> 244 

<212> DNA 

<213> Glycine max 



<400> 75 
caacaacatt cattcattca cagctgtcac gccgtgaacg gaaaatggca acggcgagac 60
gcagtttccc gcctatcacc gaatgcaacg gaacgacacc gtgcgagtct gtggccgccg 120 
acctcgacgg tacgctcctc atcncccgta gctcgttccc gtacttcatg ctcgtcgccg 180 
tcgaagccgg cagcctcctc cgcggcctca tgcnttcctg ggtttanttt gagnacccct 240 
gagg 244 



<210> 76 

<211> 240 

<212> DNA 

<213> Glycine max 



<400> 76 

gctggctacc ctcttcttcc ccaccatcac accaatcaaa cctcactcta ccctggccat 60
ggtcatggga gcctttncgc cacttcgaac cggtttccaa atgcagcacc gaanaccggt 120
ttnaccanac cgtggcctcg gncttggacg gcaccctcct ggtgtcccct agcgcctttc 180
cttactacat gctcgtcgcc atcgaagccg gcagcttcct ccgtggcttg tcctccttgg 240



<210> 77 

<211> 263 

<212> DNA 

<213> Glycine max 

<400> 77 

gtttctcggg gctgacaagg tgcttgggac tgaacttgag gccaccaaat cggggacgtt 60 
cactgggttt gttaagaagc ctggtgtgct tgttggggag cataagaaag tggctctggt 120 
gaaggagttt cagggtaatt tacctgactt gggtctaggt gatagtaaaa gtgattatga 180 
cttcatgtca atttgcaagg aagggtacat ggtgccaaga actaagtgtg aaccactacc 240 
aagaaacaag cttttaagtc caa 263

<210> 78 

<211> 258 

<212> DNA 

<213> Glycine max 

<400> 78 

ggccacgaaa tcggggaggt tcactgggtt tgttaaggag cctggtgtgc ttgttgggga 60
gcacaagaaa gtggctgttg tgaaggagtt tcagggtaat ttacctgact tgggactagg 120 
agatagtaaa agtgattatg acttcatgtc aatttgcaag gaagggtaca tggtgccaag 180 
gactaagtgt gaaccactac caagaaacaa acttttaagt ccaattattt ntcatgaggg 240 
taggtttgtt caaaggcc 258 



<210> 79

<211> 260

<212> DNA

<213> Glycine max

<400> 79

ctcttcttcc ccaccatcac accaancaaa cctcactctc cctttctccc ctgaccctct 60
ccctgccatg gtcatgggag cctttggcca cttcgaaccg gtctccaaat gcagcaccga 120
gaaccggtct aaccaaaccg tggcctcgga cttggacggc accctcctgg tgtcccccag 180
cgcatttcct tactacatgc tggtcgccat cgaagccggc agcttcctcc gtgggccttg 240
tcctccttgc ctccgtccct 260

<210> 80

<211> 257

<212> DNA

<213> Glycine max

<400> 80

gggaacaaca acaaatggca ngaaccttat ctccttccaa cttggtgcat ttatccctgg 60
atacccaatc cagcctgtaa ttgtacgcta tcctcatgtg cactttgacc aatcctgggg 120
tcatgtntct ttgggaaagc ttatgttcag aatgttcact caatttcaca acttttttga 180
ggtagaatat cttcctgtca tttatcccct ggatgataag gaaactgctg tancttntcg 240
ggagaggact agccggg 257



<210> 81 

<211> 272 

<212> DNA 

<213> Glycine max 

<400> 81

catacctttt gttggcacca ttattagagc aatgcaggtc atatatgtta acagattctt 60 
accatcatca aggaagcagg ctgttaggga aataaaggaa ctgaataaca gagaagggcc 120 
tcttgtgata aatttcctcg agtactatta tttcccgagg gaacaacaac taatggcagg 180 
aaccttatct ccttccaact tggtgcattt atccctggat acccaatcca gcctgtaatt 240 
atacgctatc ctcatgtaca ctttgaccaa tc 272

<210> 82 

<211> 245 

<212> DNA 

<213> Glycine max 

<400> 82 

gggcatttca catactagag ttcatcccag tgaaaagaaa gtgggaggct gatgaatcaa 60 
tcatgcgcca tatgctttct acattcaagg atccacaaga tcctctctgg cttgcgcttt 120 
tcccagaagg cactgatttc actgagcaaa agtgccttcg gagtcaaaaa tatgctgctg 180 
aacataagtt accggttctg aaaaatgttt tacttccaag gacaaagggg cttctgtgcc 240 
gcttg 245

<210> 83 

<211> 268 

<212> DNA 

<213> Glycine max 

<400> 83 

cagtgtcctt cctttctgga caatgttttt ggtgttgacc cttcagaagt gcacctgcat 60

gtgcggcgta ttccggtgga ggagattcca gcttctgaaa ccaaagctgc ttcttggtta 120 

atcgacacat tccagatcaa ggaccaattg ctttcggatt tcaagattca aggccatttc 180 

cctaaccaac taaatgaaaa tgaaatttct agatttaaga gcctactctc ttttatggtg 240 

atagtttctt ttactgccat gtttattt 268 

<210> 84 

<211> 265 

<212> DNA 

<213> Glycine max 

<400> 84 

gaaagagact gggcaaaaga tgaaacatca ctgaagtcag gttttaggca tctagagcac 60 

atgccattcc ctttctggtt ggcccttttt gttgaaggaa ctcgtttcac gcagacaaag 120 

cttttacaag ctcaagagtt tgctgcttca aaagggctgc ctatacctag aaatgttttg 180 

attcctcgta ctaagggttt tgtcacagca gnacaaagcc ttcggccatt tcgttccagc 240 
catttatgat tgcacatatg cagtt 265 

<210> 85 

<211> 265 

<212> DNA 

<213> Glycine max 

<400> 85 

gaaagagact gggcaaaaga tgaaacatca ctgaagtcag gttttaggca tctagagcac 60 
atgccattcc ctttctggtt ggcccttttt gttgaaggaa ctcgtttcac gcagacaaag 120 
cttttacaag ctcaagagtt tgctgcttca aaagggctgc ctatacctag aaatgttttg 180 
attcctcgta ctaagggttt tgtcacagca gnacaaagcc ttcggccatt tcgttccagc 240 
catttatgat tgcacatatg cagtt 265 

<210> 86 

<211> 301 

<212> DNA 

<213> Zea mays 



<400> 86 

ctcgtcgtca agggcacccc gccgccgccg cccaagaagg gccacccggg cgtcctcttc 60
gtctgcaacc accgcaccgt gctcgacccc gtcgaggtgg ccgtggcgct gcgccgcaag 120
gtcagctgcg tcacctacag catctccaag ttctccgagc tcatctcgcc catcaaggcc 180
gtcgcgctgt cgcgggaggc gacaaggacg ccgagaacat ccgccgcctg ctggaggagg 240
gcgacctggt catctgcccc gagggnaaca actgccgcga gcccttcctg ctgcgttcag 300
g 301



<210> 87 
<211> 309 

<400> 87

cgctcatgcg gtgtacatca acctgccgct gcccgagcgc atcgtctact acacctacaa 60
gctcatgggc atcaggctcg tcgtcaaggg caccccgccg ccgccgccca agaagggcca 120
cccgggcgtc ctcttcgtct gcaaccaccg caccgtgctc gaccccgtcg aggtggccgt 180
ggcgctgcgc cgcaaggtca gctgcgtcac ctacagcatc tccaagttct ccgagctcat 240
ctcgcccatc aaggccgtcg cgctgtcggg gaggcgacaa ggacgccgag aacatccgcc 300
gcctgctgg 309



<210> 88 
<211> 304 
<212> DNA 
<213> Zea mays 



<400> 88 

tggctgtgca ggaggcctac ctggtgacgt caaggaagta cagcccggtg cccaggaacc 60
agctgctgag cccgctgatt cgtgcacgac ggccgcctcg tgcagcgccc gacgccgctc 120
gtcgcgctcg tcaccttcct ctggatgccg ttcggcttcg cgctggcgct catgcgcgtg 180
tacatcaacc tgccgctgcc cgagcgcatc gtctactaca cctacaagct catgggcatc 240
aggctcgtcg tcaagggcac cccgccgccg ccgcccaaga agggccaccc gggcgtcctc 300
ttcg 304



<210> 89 
<211> 312 
<212> DNA 
<213> Zea mays 



<400> 89 

ggttcatcca cttgtgttgc tattngaccg gtaccgtagg agagcacagc actancatcg 60
caaagatttn gggctacggt gacaatctcc atgttctaca atcttnaggt cgaaggaatg 120
gagaatctgc ctccaaatag ctgtcctggt gtctatgttg ctaaccatca gagcttcttg 180
gatatttata cccttctaac tctagggagg tgcttcaaat ttataagcaa gaccagcatc 240
tttatgttcc ctattatagg gtgggcaatg tatctcttgg gtgtgattcc tctgcggcgt 300
atggacagca gg 312



<210> 90 
<211> 264 
<212> DNA 
<213> Zea mays 

<400> 90

ggtgctgtat ctgaaagaat ccatcgtgct catcaacaga aaaatgcacc aatgatgcta 60 
ctcttcccct gagggcacaa ctacaaatgg ggattatctc cttccattca aaacaggtgc 120 
ttttcttgca aaggcaccag ttcaaccagt cattttgaga tatccttaca aaagatttaa 180 
tgcagcatgg gattccatgt caggggcacg tcatgtattt ctgctgctct gtcaatttgt 240 
aaattaccta gaggtggtcc gctt 264 

<210> 91 

<211> 212 

<212> DNA 

<213> Zea mays 

<400> 91 

aaatgtcttg gatgcatttt tgttcagcgg gagtcgaaaa caccagattt caaaggtgtt 60 

tcaggtgctg tatttgaaag aatccatcgt gctcatcaac agaaaaatgc accaatgatg 120

ctactcttcc ctgagggcac aactacaaat ggggattatc tccttccatt caaaacaggt 180 

gcttttcttg caaaggcacc agttcaacca gt 212 

<210> 92 
<211> 267 
<212> DNA 
<213> Zea mays 

<400> 92 

gtctaaagaa atngaaaggc gtggggnaat tgtgtctaat catgtntctt atgtggatat 60 
tctttatcan atgtcagcct cttttcctag ttttgttgct aagagatcag tggntagatt 120 
gcctctagtt ggtctcataa gcaaatgtct tggatgcatt tttgttcagc gggagtnnaa 180 
aatncanatt tcaaaggtgt ttaaggtgtg gnatctgaaa gaatccatcg tgctcatcaa 240 
cagaaaaatg caccaatgat gctactc 267



<210> 93 
<211> 152 
<212> DNA 
<213> Zea mays 

<400> 93 

ctacaaatgg ggattacctt cttccattta agactggagc ctttnttgca ggtgcaccag 60 
tgcagccagt cattttgaaa tacccttaca ggagatttag tccagcatgg gatfccaatgg 120 
atggagcacg tcatgtgtta ttgctgctct gt 152 

<210> 94 
<211> 274 
<212> DNA 
<213> Zea mays 

<400> 94 

aaaatataaa ttaatatggt cttaatccca ccatataaat aacgttctct ttctgcaggg 60

caatttagtt ctttctaata ttgggctggc agagaagcgc gtgtaccatg cagcactgac 120 

tggtagtagt ctacctggcg ctagacatga gaaagatgat tgaaagacgt tgcgtcgctt 180 

tttctgtaac agacagccga ggaacactta aaaatgtaac tgtgtgcgtg tttttatacc 240 
tgtaatgtgg cagtttattt gtttgaggag gctg 274 



<210> 95 
<211> 295 
<212> DNA 
<213> Zea mays 



<400> 95 

aatagctatc aagtacaata aaatatttgt tgatgccttt tggaacagta agaagcaatc 60
ttttacaatg cacttggtcc ggctgatgac atcatgggct gttgtgtgtg atgtttggta 120
cttacctcct caatatctga gggagggaga gacggcaatt gcatttgctg agagagtaag 180
ggacatgata gctgctagag ctggactaaa gaaggttcct tgggatggct atctgaaaca 240
caaccgtcct agtcccaaac acactgaaga gaacaacgca tattgccgat ctgtc 295



<210> 96 
<211> 273 
<212> DNA 
<213> Zea mays 

<400> 96 

gngccatctc accggcggcn ggcctgcggc cggcaaccgg aggcgatggc gagctngtct 60 

gtggtggcgg acatggagca ntaccgcccc aacctggagg actacctccc gcccgactcg 120 

ctcccgcagg aggcgcccag gaatctccat ctgcgcgatc tgcttgacat ctcgccggtg 180 

ctaaccgagg cagcgggtgc catagtcgat gattcattca cccgttgctt taagtcgaat 240 
tctccagaac catggaatgg aacatatatt tgt 273 

<210> 97 
<211> 127 
<212> DNA 
<213> Zea mays 

<400> 97 

ctcaatatct ganggaggga gagactgcaa ttgcgtttgc tgagagagta agggacatga 60 
tagcagctag agctggtctt aagaaggtcc cgtgggatgg ctatctgaag cacaaccgcc 120 
ctagtcc 127 

<210> 98 
<211> 286 
<212> DNA 
<213> Zea mays 



<400> 98 

gaaccgtacg cgcctcatta cgcccatcca cgtgctcgcc tctccccatc gcataatttt 60
nctcggcggc gtcgccatct ccancggcng cnggcctgcn gccggcaacc ggaggcgatg 120
gcgagctcgt ctgtggcggc ggacatggag ctggaccgcc ccaacctgga ggactacntc 180
ccgcccgant cgctcccgca ggaggcgacc aggaatctcc atctgngcga tctgcttgan 240
atctcgccgg tgctaaccga ggcagcgggt gccatagtcg atgatt 286

<210> 99
<211> 308
<212> DNA
<213> Zea mays

<400> 99

cgccatctca tcggcggcgg gcgtgcggcc ggcggcngag gcgaggngcg attggcgagc 60
tcgtctgtgg cgccggacat ggagctggac cgcccanacc tggaggacta nctcccgccc 120
gactcgnncc cgcagaggcg ccccggaatc tccanctgcg cgatctgctg gacatcncgc 180
cggtgctcac cgaggcagcg ggtgccattg tcgatgactc cttcacacgg ngctttaagt 240
caaattctcc agagccatgg aattggaaca tatatctgtt ccccttatgt gctttggtgt 300
ataataag 308

<210> 100
<211> 282
<212> DNA
<213> Zea mays

<400> 100

cagaaactag angttagtca cagcatggca ttaaattgtc atagtaaaca acancncact 60
gagcaactat gcaatttaat gccatgctgt gactaacttc tagtttctgg cattaaatta 120
ctgtttggct actaggaaga ccgaggtaga gaagcaaata taagaatacc ctccaacgca 180
canccaaatg acagagtaaa tgaaggtagg gttcaccttc ttgaacatga ccgtatactg 240
gttgttaaca caagttcctc tgggaaaatc agagagggtt tt 282

<210> 101
<211> 282
<212> DNA
<213> Zea mays

<400> 101

ggcgcggctg gccgtggcgc tggtcctgcc gtacagtact cgacgccgat cctggcngcg 60
acnggcatgt cgtggcggct caaagggtng cgcccngngc ttgcnnngcc gtgctccggc 120
gggcgctgnc agctgttcgt gtgcaacnac cggacgctga tcgacccngt gtacgtgtcc 180
gtagcgtgga ccgggaaatg cgcgncgtgt nctacagnct gangcggntn tcggagctca 240
tctcccccat ngncggaang tgcacctgan accgggaacg gg 282

<210> 102
<211> 290
<212> DNA
<213> Zea mays

<400> 102

ggacgcggca ccatgcgcgc cgagctggcc agtggcgacg tggccgtgtg ccccgagggc 60
accacgtgcc gggagccctt cctgctccgc ttctccaagc tcttcgcgga gctcagcgac 120
aggatcgtgc ccgtggcgat gaactaccgc gtggggctct tccacccgac gacggcgcgc 180
gggtggaaag ccatggaccc catcttcttc ttcatgaacn gcggcccgtg tacgaggtga 240
cgttcctgaa ccantccccg caaagcgacg tgcgcggcgg ggaagagccc 290



<210> 103 
<211> 279 
<212> DNA 
<213> Zea mays 



<400> 103 

acgaggtgac gttcctgaac cagctccccg cagaggcgac gtgcgcggcg gggaagagcc 60 
ccgttgatgt agccaactac gttcagcgga tactcgctgc cacgctcggg ttcgagtgca 120 
ccaccctcac aaggaaggac aaatacacgg tgctcgccgg caacgacggc gtcctgaacg 180 
ccaagccggc ggcggcccgg aagccggctt ggcagagccg cgtgaaggaa gtcctcgggt 240 
tctgctccac taacaattac accttgccca gatctggac 279 



<210> 104 
<211> 315 
<212> DNA 
<213> Zea mays 



<400> 104 

gcccgagcgc atcgtctact acacctacaa gctcatgggc atcaggctcg tcgtcaaggg 60
caccccgccg ccgccgccca agaagggcca cccgggcgtc ctcttcgtct gcaaccaccg 120
caccgtgctc gaccccgtcg aggtggccgt ggcgctgcgc cgcaangtca gctgcgtcac 180
tacagcatct ccaagttctc cgagctcatc tcgcccatca aggccgtagc agnaaagcag 240
gtcgcaaatg gagcagnagc gagtcgatgg aagngaattg gcgactggtc atctgcncga 300
aggnacactg cggag 315

<210> 105 
<211> 314 
<212> DNA 
<213> Zea mays 



<400> 105 

cgagacaccg agcacgtact accagcaaga tggtggcgtc tcccagattc aagcccatcg 60
aggagtgctg ctcggagggg cggtcggagc agacggtggc cgccgacctg gacggcacgc 120
tgctcatctc caggagcgcg ttcccctact acctcctcgt ggctctcgag gccggcagcg 180
tcctccgcgc cgcgctgctg ctcctgtccg tgccgttcgt ctacgtcacc tacgccttct 240
tctccgagtc gctggccatc agcacgctgg tgtacatctc cgtggcgggg ctcaaggtgc 300
gcanatcgag atgg 314



<210> 106 
<211> 291 
<212> DNA 
<213> Zea mays 



<400> 106 

ctctgggtct ggggccgaga caccgagcac gtactaccag caagatggtg gcgtctccca 60
gattcaagcc catcgaggag tgctgctcgg aggggcggtc ggagcagacg gtggccgccg 120
acctggacgg cacgctgctc atntccagga gcgcgttccc ctactacctc ctcgtggctc 180
tcgaggccgg cagcgtcctc cgcgccgcgc tgctgctcct gtccgtgccg ttcgtctacg 240
tcacctacgc cttcttctcc gagtcgctgg ccatcagcac gctggtgtac a 291



<210> 107 
<211> 300 
<212> DNA 
<213> Zea mays 



<400> 107 

gcacgcagca gtacgacgtc tctcctctgg gtctggggcc gagacaccga gcacgtacta 60
ccagcaagat ggtggcgtct cccagattca agcccatcga ggagtgctgc tcggaggggc 120
ggtcggagca gacggtggcc gccgacctgg acggcacgct gctcatctcc aggagcgcgt 180
tcccctacta cctcctcgtg gctctcgagg ccggcagcgt cctccgcgcc gcgctgctgc 240
tcctgtccgt gccgttcgtc tacgtcacct acgccttctt ctccgagtcg ctggccatca 300



<210> 108 
<211> 284 
<212> DNA 
<213> Zea mays 

<400> 108 

gnggccgaga caccgagcac gtactaccag cangatggtg gcgtctccca gattcangcc 60 

antcgaggag tgctgctcgg aggggcggtc ggagcagacg gtggccgccg acctggacgg 120 

cacgctgctc atctccagga gcgcgttccc ctacnacctc ctcgtggctc tcgaggccgg 180 

cagcgtcctc cgcgccgcgc tgctgctcct gtccgtgccg ttcgtctacg tcactacgcc 240 

ttcttctccg agtcgctggc catcaanacg ctggtgtaca tctc 284 

<210> 109 
<211> 280 
<212> DNA 
<213> Zea mays 

<400> 109 

ctcctctggg tctggggccg agacaccgag cacgtactac cagcaagatg gtggcgtctc 60 
ccagattcaa gcccatcgag gagtgctgct cggaggggcg gtcggagcag acggtggccg 120 
ccgacctgga cggcacgctg ctcatctcca ggagcgcgtt ccnctactac ctcctcgtgg 180 
ctctcgaggc cggcagcgtc ctccgcgccg cgctgctgct cctgtccgtn ccgttcgtct 240 
acgtcaccta cgcnttnttc tccgagtcgc tggccatcag 280

<210> 110 

<211> 287 

<212> DNA 

<213> Zea mays 

<400> 110

cgtctctcct ctgggtctgg ggccgagaca ccgagcacgt actaccagca agatggtggc 60
gtctcccaga ttcaagccca tcgaggagtg ctgctcggag gggcggtcgg agcagacggt 120
ggccgccgac ctggacggca gctgctcatc tccaggagcg cgttccccta ctacctcctc 180
gtggctctcg aggccggcag cgtcctccgc gccgcgctgc tgctcctgtc cgtgccgttc 240
gtctacgtca ctacggcttc ttctccgagt cgctggccat cagcacg 287

<210> 111
<211> 286
<212> DNA
<213> Zea mays

<400> 111

cgcacagtta cgacgtctct cctctgggtc tggggccgag acaccgagca cgtactacca 60
gcaagatggt ggcgtctccc agattcaagc ccatcgagga gtgctgctcg gaggggcggt 120
cggagcagac ggtggccgcc gacctggacg gcacgctgct catctccagg agcgcgttcc 180
cctactactc ctcgtgctct cgaggccggc aggtcctccg cgccgcgctg tgctcctgtc 240
gtgcgttcgt ctagtcacta cgcttttctc gancgtggca ataana 286

<210> 112
<211> 323
<212> DNA
<213> Zea mays

<400> 112

gttattccct gaaggtacca caacaaatgg gagattcctg atttcgttcc aacatggtgc 60
attcatacct ggctaccctg ttcaacctgt tgttgtccgt tatccacatg tgcactttga 120
tcaatcatgg gggnatatat cgttattaaa gctcatgttt aagatgttca cccaatttca 180
taatttcatg gaggtagagt accttcctgt tgtctaccct cctgagatca agcaagagaa 240
tgcccttcat tttgcggagg ataccagcta tgctatggca cgtgccctca atgtcttgcc 300
aacttcctat tcatatggtg att 323

<210> 113
<211> 312
<212> DNA
<213> Zea mays

<400> 113

cgataaggcc cttttcgaag agcttctacc gtcggatcaa cagattcttg gccgagctgc 60
tgtggcttca gcttgtctgg gtggtggact ggtgggcagg tgttaaggta caactgcatg 120
cagatgagga aacttacaga tcaatgggta aagagcatgc actcatcata tcaaatcatc 180
ggagtgatat tgattggctc attggatgga tattggccca gcgttcaggg tgccttggaa 240
gtacacttgc tgtcatgaag aagtcatcca agttccttcc agttattggc tggtcaatgt 300
ggtttgcaga gt 312

<210> 114
<211> 279
<212> DNA
<213> Zea mays

<400> 114

agtggggtct ccaaaggttg aaagacttcc ctagaccatt ttggctagct ctttttgttg 60
agggtactcg ctttactcca gcaaagcttc tcgcagctca ggagtatgcg gcttcccagg 120
gcttaccagc tcctagaaat gtacttattc cacgtaccaa gggatttgta tctgccgtaa 180
gtattatgcg agattttgtt ccagccattt acgatacaac tgtaatagtt cctaaagatt 240
cccctcaacc aacaatgctg cggattttga aagggcaat 279

<210> 115
<211> 304
<212> DNA
<213> Zea mays

<400> 115

cgtcaacgcc atccaggccg tcctatttgt gacgataagg cccttttcga agagcttcta 60
ccgtcggatc aacagattct tggccgagct gctgtggctt cagcttgtct gggtggtgga 120
ctggtgggca ggtgttaagg tacaactgca tgcagatgag gaaacttaca gatcaatggg 180
taaagagcat gcactcatca tatcaaatca tcggagtgat attgattggc tcatggatgg 240
atattggccc agcgttcagg gtgccttgga agtacattgc tgtcatgaag aagtcatcca 300
agtt 304

<210> 116
<211> 259
<212> DNA
<213> Zea mays

<400> 116

cttcctcctg tccggcctca tcgtcaacgc catccaggcc gtcctatttg tgacgataag 60
gcccntttcg aagagcttct aacgtcggat caacagattc ntggccgagc tgctgtggct 120
tcagcttgtc tgggtggtgg acnggtgggc aggtgttaag gtacaactgc atgcngatga 180
ggaaacttac agatcnatgg gtanagagca tgcactcatc atatcaaatc atcggagtga 240
tattgattgg cncattgga 259

<210> 117
<211> 235
<212> DNA
<213> Zea mays

<400> 117

attccacgta ccaagggatt tgtatctgct gtaagtatta tgcgagattt tgttccagcc 60
atttatgata caactgtaat agttcctaaa gattcccctc aaccaacaat gctgcggatt 120
ttgaaagggc aatcatcagt gatacatgtc cgcatgaaac gtcatgcaat gagtgagatg 180
ccaaaatcag atgaggatgt ttcaaaatgg tgtaaagaca tttttgtggc aaagg 235

<210> 118 
<211> 282 
<212> DNA 
<213> Zea mays 



<400> 118 

tgagatgcca aaatcagatg atgacgtttc aaaatggtgt aaagacattt ttgtgacaaa 60
ggatgcctta ctggacaaac atttggcaac aggcactttc gatgaggaga ttagacctat 120
cggccgccca gtgaaatcat tgctggtgac cctgttttgg tcgtgcctgc tgttgtttgg 180
tgccatcgag ttcttcaagt ggacgcagct cctatcgaca tggagaggag tggcattcac 240
tgccgcagga tggcgctcgt gacaggggtc atgcacgtct tc 282



<210> 119
<211> 166
<212> DNA
<213> Zea mays

<400> 119

ctggtgggca ggcgttaagg tacaactaca tgcggatgag gacacttacc gatcaatggg 60
taaagagcat gcactcgtca tatcaaatca tcgaagtgat attgattggc ttattggatg 120
gatattggcc cagcgctcag ggtgccttgg aagtacgctc gctgtc 166

<210> 120
<211> 234
<212> DNA
<213> Zea mays

<400> 120

agtcanccaa gntccttcca gtcattggct ggtcaatgtg gtttgcagag tacctctttt 60
nggagaggag ctgggccaag gatgaaaaga cactaaagtg gggtctccaa aggttgaaag 120
acttccctag accatttngg ctagctcttn tttgtngagg gnantcgctt tactccagca 180
angnttntng aggnnncagn agnnncgggn ttcccanggg ttaacagncc cana 234

<210> 121 
<211> 210 
<212> DNA 
<213> Zea mays 

<400> 121

gtgagatgcn aaaatcagat gatgacgttt caaaatggtg taaagacatt tttgtggaca 60
aaggatgcct tactggacaa acatttggca acaggcactt tcgatgagga gattagacct 120
atcggccgcc cagtgaaatc atngctggtg accctgtnnt ggtcgtgcct gctgttgttt 180
ggtgccatcg agntcttcaa gtggacgcag 210

<210> 122 
<211> 274 
<212> DNA 
<213> Zea mays

<400> 122

acncccgaat ccgccgcgcg cgcnccgtcc tcgtcgccgg cggaggcgcc cgcnaccgcc 60
cacagcagcc tatcgccgga gaaggaacgc cgcggggagc ttttccacng ccatctcccg 120
tctgacccct ccgagatcgn aagcggcggc catggcgatc ccgctcgtgc tcgtcgtgct 180
cccgctcggc ctcctcttcc tcctgtccgg cctcatcgtc aacaccatcc aggccatcct 240
atttgtgaca ataaggccct tttccaagag cttg 274



<210> 123 
<211> 305 
<212> DNA 
<213> Zea mays 



<400> 123 

ttgcactgag gaaaggccat tagggatata tcaagtacat acataagagc agcttgatga 60
agttgcctat ttttagctgg gcatttcaca tttttgagtt tatcccggta gaacggaaat 120
gggagattga tgaagcaatt attcagaaca agctatcaaa atttaagaac ccgagagatc 180
ctatctggtt ggcggttttt cctgaaggca cggattatac tgagaagaaa tgcatcatga 240
gtcaagagta tgcttcagaa catggcttgc ctatgctaga acatgtcctc cttccaaaga 300
caagg 305



<210> 124 
<211> 279 
<212> DNA 
<213> Zea mays 



<400> 124 

ccagattttc tggacaatgt gtatggcgtt gatccttctg aagtccacat ccacgtcaga 60
atggttcagc tccatcacat ccccacaaca gaagacaaga taacagaatg gatggncgag 120
aggtttaggc agaaggacca gctcctggca gatttcttca tgaaggggca tttcctgatg 180
aaaggaactg aaaggagatc tgtcgacgcc gagtgcctgg caaactttct taaccagtag 240
tatgcttgac ggccnatctg gtttgtacct aaactcttt 279

<210> 125
<211> 219
<212> DNA
<213> Zea mays

<400> 125

agattttntg gacaatgtgt atggngttga tccttntgaa gtncacatcc acgtnagaat 60
ggttcagctc catcacatcc ccacaacagn agacaagata acagaangga tggtagagag 120
gtttaggcag aaggaccagc tcctggcaga tttcttcatg aaggggcact ttcctgatga 180
aggaactgaa ggagatctgt cgacgccgaa gtgcctggc 219



<210> 126 
<211> 293 
<212> DNA 
<213> Zea mays 



<400> 126 

taccatagat gctgtgtacg acatcacgat cgcntacaaa caccggcngc ngacatttct 60
ngacaacgtc tacngcgtgg ntccttcgga agtccacatc cacatcanca gcatccaggt 120
ctccgacata ncggcgtccg aaaaacgggg tggctggcng gntnngtgga gcggttcaag 180
gcntnganna acgagctngc tgttcggggc tttctaccgc ggctggggcc aatttcnccc 240
cgaacgaaag ggaaaaaggg gaaccgaagg ggggaacctg ttngaacggg ncc 293



<210> 127 
<211> 6 
<212> PRT 

<213> conserved sequence 
<400> 127 

Val Xaa Asn His Xaa Ser 
1 5 



<210> 128 
<211> 6 
<212> PRT 

<213> conserved sequence
<400> 128 

Val Thr Tyr Ser Xaa Ser 
1 5 

<210> 129
<211> 7 
<212> PRT 

<213> conserved sequence 
<400> 129 

Val Xaa Leu Thr Arg Xaa Arg 
1 5 

<210> 130 
<211> 5 
<212> PRT

<213> conserved sequence
<400> 130

Cys Pro Glu Gly Thr
1 5



<210> 131 
<211> 5 
<212> PRT 

<213> conserved sequence 
<400> 131 

Ile Val Pro Val Ala
1 5 



<210> 132 
<211> 7 
<212> PRT 

<213> conserved sequence 
<400> 132 

Leu Xaa Xaa Gly Asp Leu Val 
1 5 



<210> 133 
<211> 6 
<212> PRT 

<213> conserved sequence 
<400> 133 

Phe Xaa Xaa Gly Ala Phe 
1 5 



<210> 134 
<211> 6
<212> PRT 

<213> Synthetic Oligonucleotide 
<400> 134 

Val Ala Asn Xaa Xaa Gln
1 5 



<210> 135 
<211> 30 
<212> DNA 



<213> Synthetic Oligonucleotide
<400> 135 

ccatccgctt caagggaacg acacccatca 30 

<210> 136 
<211> 31 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 136 

tccctgtctt gcttgatgaa cttaaagctt g 31 

<210> 137 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 137 

acagcaggag tgtctgatga tggcagattc 30

<210> 138 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 138 

actggagttc cagccaaaaa tgcacctgtc 30 

<210> 139 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 139 

gatacaccct tgaaatcagg cgattttgct 30 

<210> 140 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 140 

ttgcaaattc aattcctgtt tcaccgggcc 30 

<210> 141 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 141 

gttttctgct attccagaag gcgtcaacaa 30 

<210> 142 
<211> 32 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 142 

cattgaagat ccgtccgtga agttncctta cc 32 

<210> 143 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 143 

tcgagctgtg atcgatgatt ggctgtgaag 30



<210> 144 
<211> 30
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 144 

gtctcttcaa aaacacacac acacgtctct 

<210> 145 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 145 

gtctcttcaa aaacacacac acacgtctct 

<210> 146 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 146 

gtagagagcc ttacttgctt cggtttagtc 

<210> 147 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 147 

acgtcatcgt acctgttgct attgactcac 

<210> 148 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 148 

acttttccat tgtcagggac tcctcgacac 

<210> 149 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 149 

acggtgtagg aagggaaagg attcaaaagg 

<210> 150 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 150 

gcgatgaact acagagtcgg attcttcctc 

<210> 151 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 151 

ccggtttacg agattacgtt cttgaaccag 

<210> 152 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 152 

caatggagac aaggctcgaa agtgctaacc 

<210> 153
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 153 

attctctgaa catagttcgc cacggtcatg 

<210> 154 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 154 

gaaatccaac gccttcccaa tatcactctg 

<210> 155 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 155 

cttcaacttt ccatcaggat cttggcacgt 

<210> 156 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 156 

accacttgtt agagacctta cctgcttagg 

<210> 157 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 157 

tcctacctac accatccaat ttctcgaccc 

<210> 158 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 158 

ctgcgtcaag tgagcaactc agttcttgca 

<210> 159 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 159 

tgggaagcag cacgttgttc agtatcggaa 

<210> 160 
<211> 30 
<212> DNA 

<213> Synthetic Oligonucleotide 
<400> 160 

tagcctctgt gtaatctgtg ccctcgggga 

<210> 161 
<211> 1702 
<212> DNA 

<213> Simmondsia chinensis 

<400> 161

gaattctagc ctctctcctc ctgcaattct acttgctttc tacgatcttt ccctctctct 60
ctctaaaacc ttaaaattgg aatggaatcg tttaaaaata tgatcttttt gtaattgaat 120
tagtataatt atatctgggt aatcttgaat ttgttggtga ggccatgggg atcccagctg 180
cggctgtgat tgtaccgctt ggcttgctct tcttcttctc tggtctcttc atcaacttca 240
ttcaggcaat ttgttttgtg ctcgtgcggc cactgtcaaa gnntacatac agaaggatta 300
acagggtgct ggtggaattg ttgtggcttg agctgatatg gctcgtagat tggtgggcaa 360
gtgttaagat caagttgttc acagatcctg atacctttcg gctaatgggt aaagagcatg 420
cacttgtgat atcaaaccac agaagtgata ttgattggct tgttggatgg gtgttggccc 480
agagatcagg ctgcctggga agcacactgg ctgtcatgaa gaaatcatca aagtttctcc 540
cggtcatagg ttggtctatg tggttttctg agtacctttt tcttgagaga agctgggcca 600
aggatgaaag cacattgaag ttaggtcttc aacgcctcaa ggactaccct ctgcctttct 660
ggttggctct tttcgtagaa ggaacacgat ttacccaagc taaactttta gcagctcaag 720
aatatgctac ttcaatggga ttgccagttc ctagaaatac tttgatccct cgtactaagg 780
gatttgtttc agccgtgagc catatgcgtt cgtttgtccc ggccatatat gatgtaacgg 840
tggccatccc taaatcttct tcgcagccta caatgctcag acttttcaaa ggccagccat 900
ccacggttca tgtacacatc aagcgccgct cgatgaaaga tctccctgaa gcagcagatg 960
atgttgcaca atggtgtcga gacacattcg tcgcaaagga tgcactcctg gacaagcata 1020
atgtagatga cactttcgga gatgagtatc tgcaggacac tggccggcct ttgaaatctc 1080
tctttgtagc agtctcttgg gcattgattc tcatcctggg aggtttgaaa ttcctacgat 1140
ggtcgtccct tctatcatca tggaaggggg tcgccttctc agccgcatgc cttgtgctcg 1200
tcaccattct tatgcagatc ttaatccaat tttctcaatc cgagcgctcg actcctgcta 1260
aggtagcccc aggaaagccc aagaacatgg tatcagaacc cacggaaacg caacgacata 1320
agcagcacta aaagtatata tggaccccaa ctaagaagat tcagacgcaa gccacagttg 1380
attcaactgt tcagaatgtc aaatatagtt tgagaaacaa aagatcaaga ttagctgatg 1440
aagagcctaa tgaacctaca tacttggatc tgtcgtcgcc accgtctgct gctagctcgt 1500
tatcagaatt cgtgattccg ggaccgatcc cggatcttag ccttctatgc atggattatg 1560
atagtatctt aaatttcttt aatgatgtac cggaattata atgttagtta attaggggga 1620
tgagcattgt ttgggtttat atcgtggtaa atccttgtat tgtttataag atttgaagaa 1680
aattcgattc gagtgctctg aa 1702



<210> 162 
<211> 387 
<212> PRT 

<213> Simmondsia chinensis 



<400> 162 
Met Gly Ile Pro Ala Ala Ala Val Ile Val Pro Leu Gly Leu Leu Phe
1 5 10 15

Phe Phe Ser Gly Leu Phe Ile Asn Phe Ile Gln Ala Ile Cys Phe Val
20 25 30

Leu Val Arg Pro Leu Ser Lys Thr Tyr Arg Arg Ile Asn Arg Val Leu
35 40 45

Val Glu Leu Leu Trp Leu Glu Leu Ile Trp Leu Val Asp Trp Trp Ala
50 55 60

Ser Val Lys Ile Lys Leu Phe Thr Asp Pro Asp Thr Phe Arg Leu Met
65 70 75 80

Gly Lys Glu His Ala Leu Val Ile Ser Asn His Arg Ser Asp Ile Asp
85 90 95

Trp Leu Val Gly Trp Val Leu Ala Gln Arg Ser Gly Cys Leu Gly Ser
100 105 110

Thr Leu Ala Val Met Lys Lys Ser Ser Lys Phe Leu Pro Val Ile Gly
115 120 125

Trp Ser Met Trp Phe Ser Glu Tyr Leu Phe Leu Glu Arg Ser Trp Ala
130 135 140

Lys Asp Glu Ser Thr Leu Lys Leu Gly Leu Gln Arg Leu Lys Asp Tyr
145 150 155 160

Pro Leu Pro Phe Trp Leu Ala Leu Phe Val Glu Gly Thr Arg Phe Thr
165 170 175

Gln Ala Lys Leu Leu Ala Ala Gln Glu Tyr Ala Thr Ser Met Gly Leu
180 185 190

Pro Val Pro Arg Asn Thr Leu Ile Pro Arg Thr Lys Gly Phe Val Ser
195 200 205

Ala Val Ser His Met Arg Ser Phe Val Pro Ala Ile Tyr Asp Val Thr
210 215 220

Val Ala Ile Pro Lys Ser Ser Ser Gln Pro Thr Met Leu Arg Leu Phe
225 230 235 240

Lys Gly Gln Pro Ser Thr Val His Val His Ile Lys Arg Arg Ser Met
245 250 255

Lys Asp Leu Pro Glu Ala Ala Asp Asp Val Ala Gln Trp Cys Arg Asp
260 265 270

Thr Phe Val Ala Lys Asp Ala Leu Leu Asp Lys His Asn Val Asp Asp
275 280 285

Thr Phe Gly Asp Glu Tyr Leu Gln Asp Thr Gly Arg Pro Leu Lys Ser
290 295 300

Leu Phe Val Ala Val Ser Trp Ala Leu Ile Leu Ile Leu Gly Gly Leu
305 310 315 320

Lys Phe Leu Arg Trp Ser Ser Leu Leu Ser Ser Trp Lys Gly Val Ala
325 330 335

Phe Ser Ala Ala Cys Leu Val Leu Val Thr Ile Leu Met Gln Ile Leu
340 345 350

Ile Gln Phe Ser Gln Ser Glu Arg Ser Thr Pro Ala Lys Val Ala Pro
355 360 365

Gly Lys Pro Lys Asn Met Val Ser Glu Pro Thr Glu Thr Gln Arg His
370 375 380

Lys Gln His
385



<210> 163 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 163 

aagcttgcat gcgtcgacac aatggttcat gcgaccaagt cag

<210> 164
<211> 35
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence : Synthetic
Oligonucleotide

<400> 164

ggtaccgtcg actcacttct tggtgttgtt gatag

<210> 165 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 165 

ggatccgcgg ccgcacaatg acgagcttta ctacttccct tcat

<210> 166 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 166 

ggatcccctg caggttagag atccattgat tctgcaat

<210> 167 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 167 

ggatccgcgg ccgcataatg gaatcagagc tcaaagat 

<210> 168 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 168 

ggatcccctg caggtcattc ttctttctga tggaaatc 

<210> 169 

<211> 41 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 169 

ggatccgcgg ccgcacaatg actcgttcac aagatgtttc a

<210> 170
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 170 

ggatcccctg caggtcactt ctcttccaat ctagccag 

<210> 171 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 171 

ggatccgcgg ccgcacaatg tccggtaata agatctcgac tcttca 

<210> 172 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 172 

ggatcccctg caggttattt tttcttgaca actccgttat taccgg 

<210> 173 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 173 

atatccgcgg ccgcacaatg gttatggagc aagctggaa 

<210> 174 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 174 

ggatcccctg caggtcaatg gagacaaggc tcgaaagt 

<210> 175 

<211> 42 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 175

ggatccgcgg ccgcacaatg tccgccaaga tttcaatatt cc

<210> 176 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 176 

ggatcccctg caggttaatt tttcttaact actccatt

<210> 177
<211> 42
<212> DNA

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 177 

ggatccgcgg ccgcacaatg ggagctcagg agaaacggcg cc

<210> 178
<211> 38
<212> DNA

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 178 

ggatcccctg caggtcacgt cttctccttc ttcaccgg 

<210> 179 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 179 

ggatccgcgg ccgcacaatg gcggatcctg atctgtcttc tcct

<210> 180
<211> 44 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 180 

ggatcccctg caggttatgt tggggccaag tcaggtgcaa agat 

<210> 181 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 181

ggatccgcgg ccgcaaaatg gaaaaaaaga gtgtaccaaa ttct 44 

<210> 182 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 182 

ggatcccctg caggttattt gtttactaat ttgagggaat tttttg 46 

<210> 183 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 183 

tcgacctgca ggaagcttaa ggatggtgat tgctgc 36

<210> 184 

<211> 31 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 184 

ggatccgcgg ccgcttactt ctccttctcc g 31 

<210> 185 

<211> 39 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 185 

ggatccgcgg ccgcacaatg tcttttaggg atgtcctag 39

<210> 186 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 186 

ggatcccctg caggtcaatc atccttaccc tttggtttac c 41 

<210> 187 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 



WO 00/18889 5Q PCT/US99/22231 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 187 

atgtctttta gggatgtcct agaaagagga gatgaatttt ctgtgcggta tttcacaccg 60 

<210> 188 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 188 

tcaatcatcc ttaccctttg gtttaccctc tggaggcaga agattgtact gagagtgcac 60 

<210> 189 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 189 

ggatccgcgg ccgcacaatg aagcattccc aaaaataccg tagg 44 

<210> 190 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 190 

ggatcccctg caggtcaatg attttttttc atcacaaata c 41 

<210> 191 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 191 

atgaagcatt cccaaaaata ccgtaggtat ggaatttatg ctgtgcggta tttcacaccg 60 

<210> 192 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 192 

tcaatgattt tttttcatca caaatacaag aataagaaaa agattgtact gagagtgcac 60

<210> 193
<211> 43
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 193 

ggatccgcgg ccgcacaatg ggttttgttg atttcttcga aac 43 

<210> 194 
<211> 45 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 194 

ggatcccctg caggttattt ggtctcaatt ttaatatttt tttgc 45 

<210> 195 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 195 

atgggttttg ttgatttctt cgaaacatat atggtcggtt ctgtgcggta tttcacaccg 60 

<210> 196 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 196 

ttatttggtc tcaattttaa tatttttttg caaggactcg agattgtact gagagtgcac 60 

<210> 197 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 197 

ggatccgcgg ccgcacaatg gaaaagtaca ccaattggag agac 44 

<210> 198 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 198 

ggatcccctg caggctactt cctcttttta cgttgatcgc tg 42

<210> 199
<211> 60
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 199 

atggaaaagt acaccaattg gagagacaat ggtacgggaa ctgtgcggta tttcacaccg 60 

<210> 200 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 200 

ctacttcctc tttttacgtt gatcgctgat atattccttc agattgtact gagagtgcac 60 

<210> 201 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 201 

ggatccgcgg ccgcacaatg cctgcaccaa aactcacgga g 41 

<210> 202 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 202 

ggatcccctg caggctacgc atctccttct ttcccttc 38 

<210> 203 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 203 

atgcctgcac caaaactcac ggagaaatct gcctcttcca ctgtgcggta tttcacaccg 60 

<210> 204 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 204 

ctacgcatct ccttctttcc cttcttcttc ttcttcctct agattgtact gagagtgcac 60

<210> 205
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 205 

ggatccgcgg ccgcacaatg tctgctcccg ctgccgatca taacgc 46 

<210> 206 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 206 

ggatcccctg caggtcattc tttcttttcg tgttctcttt tctg 44 

<210> 207 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 207 

atgtctgctc ccgctgccga tcataacgct gccaaaccta ctgtgcggta tttcacaccg 60 

<210> 208 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 208 

tcattctttc ttttcgtgtt ctcttttctg tcttaccagc agattgtact gagagtgcac 60 

<210> 209 
<211> 49 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 209 

ggatccgcgg ccgcacaatg ctgcatcaaa aaatagctca taaagttcg 49 

<210> 210 

<211> 49 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 210

ggatcccctg caggtcaaaa aataaaacaa taaagtttat aaactaacc 49

<210> 211
<211> 60
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence : Synthetic
Oligonucleotide

<400> 211 

atgctgcatc aaaaaatagc tcataaagtt cgaaaagtcg ctgtgcggta tttcacaccg 60

<210> 212
<211> 60
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 212 

tcaaaaaata aaacaataaa gtttataaac taaccaaatt agattgtact gagagtgcac 60 

<210> 213 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 213 

ggatccgcgg ccgcacaatg agtgtgatag gtaggttctt g 41 

<210> 214 

<211> 41 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 214 

ggatcccctg caggttaatg catctttttt acagatgaac c 41

<210> 215 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide 

<400> 215 

atgagtgtga taggtaggtt cttgtattac ttgaggtccg ctgtgcggta tttcacaccg 60 

<210> 216 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence : Synthetic 
Oligonucleotide

<400> 216

ttaatgcatc ttttttacag atgaaccttc gttatgggta agattgtact gagagtgcac 60

<210> 217
<211> 381
<212> PRT

<213> Saccharomyces sp.

<220> 

<400> 217 

Met Ser Phe Arg Asp Val Leu Glu Arg Gly Asp Glu Phe Leu Glu Ala
1 5 10 15

Tyr Pro Arg Arg Ser Pro Leu Trp Arg Phe Leu Ser Tyr Ser Thr Ser
20 25 30

Leu Leu Thr Phe Gly Val Ser Lys Leu Leu Leu Phe Thr Cys Tyr Asn
35 40 45

Val Lys Leu Asn Gly Phe Glu Lys Leu Glu Thr Ala Leu Glu Arg Ser
50 55 60

Lys Arg Glu Asn Arg Gly Leu Met Thr Val Met Asn His Met Ser Met
65 70 75 80

Val Asp Asp Pro Leu Val Trp Ala Thr Leu Pro Tyr Lys Leu Phe Thr
85 90 95

Ser Leu Asp Asn Ile Arg Trp Ser Leu Gly Ala His Asn Ile Cys Phe
100 105 110

Gln Asn Lys Phe Leu Ala Asn Phe Phe Ser Leu Gly Gln Val Leu Ser
115 120 125

Thr Glu Arg Phe Gly Val Gly Pro Phe Gln Gly Ser Ile Asp Ala Ser
130 135 140

Ile Arg Leu Leu Ser Pro Asp Asp Thr Leu Asp Leu Glu Trp Thr Pro
145 150 155 160

His Ser Glu Val Ser Ser Ser Leu Lys Lys Ala Tyr Ser Pro Pro Ile
165 170 175

Ile Arg Ser Lys Pro Ser Trp Val His Val Tyr Pro Glu Gly Phe Val
180 185 190

Leu Gln Leu Tyr Pro Pro Phe Glu Asn Ser Met Arg Tyr Phe Lys Trp
195 200 205

Gly Ile Thr Arg Met Ile Leu Glu Ala Thr Lys Pro Pro Ile Val Val
210 215 220

Pro Ile Phe Ala Thr Gly Phe Glu Lys Ile Ala Ser Glu Ala Val Thr
225 230 235 240

Asp Ser Met Phe Arg Gln Ile Leu Pro Arg Asn Phe Gly Ser Glu Ile
245 250 255

Asn Val Thr Ile Gly Asp Pro Leu Asn Asp Asp Leu Ile Asp Arg Tyr
260 265 270

Arg Lys Glu Trp Thr His Leu Val Glu Lys Tyr Tyr Asp Pro Lys Asn
275 280 285

Pro Asn Asp Leu Ser Asp Glu Leu Lys Tyr Gly Lys Glu Ala Gln Asp
290 295 300

Leu Arg Ser Arg Leu Ala Ala Glu Leu Arg Ala His Val Ala Glu Ile
305 310 315 320

Arg Asn Glu Val Arg Lys Leu Pro Arg Glu Asp Pro Arg Phe Lys Ser
325 330 335

Pro Ser Trp Trp Lys Arg Phe Asn Thr Thr Glu Gly Lys Ser Asp Pro
340 345 350

Asp Val Lys Val Ile Gly Glu Asn Trp Ala Ile Arg Arg Met Gln Lys
355 360 365

Phe Leu Pro Pro Glu Gly Lys Pro Lys Gly Lys Asp Asp
370 375 380

<210> 218
<211> 396
<212> PRT

<213> Saccharomyces sp.

<220> 

<400> 218 

Met Lys His Ser Gln Lys Tyr Arg Arg Tyr Gly Ile Tyr Glu Lys Thr
1 5 10 15

Gly Asn Pro Phe Ile Lys Gly Leu Gln Arg Leu Leu Ile Ala Cys Leu
20 25 30

Phe Ile Ser Gly Ser Leu Ser Ile Val Val Phe Gln Ile Cys Leu Gln
35 40 45

Val Leu Leu Pro Trp Ser Lys Ile Arg Phe Gln Asn Gly Ile Asn Gln
50 55 60

Ser Lys Lys Ala Phe Ile Val Leu Leu Cys Met Ile Leu Asn Met Val
65 70 75 80

Ala Pro Ser Ser Leu Asn Val Thr Phe Glu Thr Ser Arg Pro Leu Lys
85 90 95

Asn Ser Ser Asn Ala Lys Pro Cys Phe Arg Phe Lys Asp Arg Ala Ile
100 105 110

Ile Ile Ala Asn His Gln Met Tyr Ala Asp Trp Ile Tyr Leu Trp Trp
115 120 125

Leu Ser Phe Val Ser Asn Leu Gly Gly Asn Val Tyr Ile Ile Leu Lys
130 135 140

Lys Ala Leu Gln Tyr Ile Pro Leu Leu Gly Phe Gly Met Arg Asn Phe
145 150 155 160

Lys Phe Ile Phe Leu Ser Arg Asn Trp Gln Lys Asp Glu Lys Ala Leu
165 170 175

Thr Asn Ser Leu Val Ser Met Asp Leu Asn Ala Arg Cys Lys Gly Pro
180 185 190

Leu Thr Asn Tyr Lys Ser Cys Tyr Ser Lys Thr Asn Glu Ser Ile Ala
195 200 205

Ala Tyr Asn Leu Ile Met Phe Pro Glu Gly Thr Asn Leu Ser Leu Lys
210 215 220

Thr Arg Glu Lys Ser Glu Ala Phe Cys Gln Arg Ala His Leu Asp His
225 230 235 240

Val Gln Leu Arg His Leu Leu Leu Pro His Ser Lys Gly Leu Lys Phe
245 250 255

Ala Val Glu Lys Leu Ala Pro Ser Leu Asp Ala Ile Tyr Asp Val Thr
260 265 270

Ile Gly Tyr Ser Pro Ala Leu Arg Thr Glu Tyr Val Gly Thr Lys Phe
275 280 285

Thr Leu Lys Lys Ile Phe Leu Met Gly Val Tyr Pro Glu Lys Val Asp
290 295 300

Phe Tyr Ile Arg Glu Phe Arg Val Asn Glu Ile Pro Leu Gln Asp Asp
305 310 315 320

Glu Val Phe Phe Asn Trp Leu Leu Gly Val Trp Lys Glu Lys Asp Gln
325 330 335

Leu Leu Glu Asp Tyr Tyr Asn Thr Gly Gln Phe Lys Ser Asn Ala Lys
340 345 350

Asn Asp Asn Gln Ser Ile Val Val Thr Thr Gln Thr Thr Gly Phe Gln
355 360 365

His Glu Thr Leu Thr Pro Arg Ile Leu Ser Tyr Tyr Gly Phe Phe Ala
370 375 380

Phe Leu Ile Leu Val Phe Val Met Lys Lys Asn His
385 390 395

<210> 219
<211> 479
<212> PRT

<213> Saccharomyces sp.
<220> 



<400> 219 

Met Gly Phe Val Asp Phe Phe Glu Thr Tyr Met Val Gly Ser Arg Val
1 5 10 15

Gln Phe Lys Gln Leu Asp Ile Ser Asp Trp Leu Ser Leu Thr Pro Arg
20 25 30

Leu Leu Ile Leu Phe Gly Tyr Phe Tyr Leu His Ser Phe Phe Thr Ala
35 40 45

Ile Asn Gln Phe Leu Gln Phe Ile Asn Thr Asn Ser Phe Cys Leu Arg
50 55 60

Leu His Leu Leu Tyr Asp Arg Phe Trp Ser His Val Pro Ile Ile Gly
65 70 75 80

Glu Tyr Lys Ile Arg Leu Leu Ser Arg Ala Leu Thr Tyr Ser Lys Leu
85 90 95

Lys Ile Ile Pro Thr Leu Asp Lys Val Leu Glu Ala Ile Glu Ile Trp
100 105 110

Phe Gln Leu His Leu Val Glu Met Thr Phe Glu Lys Lys Lys Asn Val
115 120 125

Gln Ile Phe Ile Thr Glu Gly Ser Asp Asp Leu Asn Phe Phe Lys Asp
130 135 140

Ser Lys Phe Gln Thr Thr Leu Met Ile Cys Asn His Arg Ser Val Asn
145 150 155 160

Asp Tyr Thr Leu Ile Asn Tyr Leu Phe Leu Lys Ser Cys Pro Thr Lys
165 170 175

Phe Tyr Thr Lys Trp Glu Phe Leu Gln Lys Leu Arg Lys Gly Glu Asp
180 185 190

Leu Ala Glu Trp Pro Gln Leu Lys Phe Leu Gly Trp Gly Lys Met Phe
195 200 205

Asn Phe Pro Arg Leu Asp Leu Leu Lys Asn Ile Phe Phe Lys Asp Glu
210 215 220

Thr Leu Ala Leu Ser Ser Asn Glu Leu Arg Asp Ile Leu Glu Arg Gln
225 230 235 240

Asn Asn Gln Ala Ile Thr Ile Phe Pro Glu Val Asn Ile Met Ser Leu
245 250 255

Glu Leu Ser Ile Ile Gln Arg Lys Leu His Gln Asp Phe Pro Phe Val
260 265 270

Ile Asn Phe Tyr Asn Leu Leu Tyr Pro Arg Phe Lys Asn Phe Thr Thr
275 280 285

Leu Met Ala Ala Phe Ser Ser Ile Lys Asn Ile Lys Arg Lys Lys Asn
290 295 300

Arg Asn Asn Ile Ile Lys Glu Ala Arg Tyr Leu Phe His Arg Glu Leu
305 310 315 320

Asp Lys Leu Val His Lys Ser Met Lys Met Glu Ser Ser Lys Val Ser
325 330 335

Asp Lys Thr Thr Pro Pro Met Ile Val Asp Asn Ser Tyr Leu Leu Thr
340 345 350

Lys Lys Glu Glu Ile Ser Ser Gly Lys Pro Lys Val Val Arg Ile Asn
355 360 365

Pro Tyr Ile Tyr Asp Val Thr Ile Ile Tyr Tyr Arg Val Lys Tyr Thr
370 375 380

Asp Ser Gly His Asp His Thr Asn Gly Asp Leu Arg Leu His Lys Gly
385 390 395 400

Tyr Gln Leu Glu Gln Ile Ser Pro Thr Ile Phe Glu Met Ile Gln Pro
405 410 415

Glu Met Glu Ser Glu Asn Asn Ile Lys Asp Lys Asp Pro Ile Val Val
420 425 430

Met Val Asn Val Lys Lys His Gln Ile Gln Pro Leu Leu Ala Tyr Asn
435 440 445

Asp Glu Ser Leu Glu Lys Trp Leu Glu Asn Arg Trp Ile Glu Lys Asp
450 455 460 

Arg Leu Ile Glu Ser Leu Gln Lys Asn Ile Lys Ile Glu Thr Lys
465 470 475

<210> 220
<211> 300
<212> PRT

<213> Saccharomyces sp.
<400> 220 

Met Glu Lys Tyr Thr Asn Trp Arg Asp Asn Gly Thr Gly Ile Ala Pro
1 5 10 15

Phe Leu Pro Asn Thr Ile Arg Lys Pro Ser Lys Val Met Thr Ala Cys
20 25 30

Leu Leu Gly Ile Leu Gly Val Lys Thr Ile Ile Met Leu Pro Leu Ile
35 40 45

Met Leu Tyr Leu Leu Thr Gly Gln Asn Asn Leu Leu Gly Leu Ile Leu
50 55 60

Lys Phe Thr Phe Ser Trp Lys Glu Glu Ile Thr Val Gln Gly Ile Lys
65 70 75 80

Lys Arg Asp Val Arg Lys Ser Lys His Tyr Pro Gln Lys Gly Lys Leu
85 90 95

Tyr Ile Cys Asn Cys Thr Ser Pro Leu Asp Ala Phe Ser Val Val Leu
100 105 110

Leu Ala Gln Gly Pro Val Thr Leu Leu Val Pro Ser Asn Asp Ile Val
115 120 125

Tyr Lys Val Ser Ile Arg Glu Phe Ile Asn Phe Ile Leu Ala Gly Gly
130 135 140

Leu Asp Ile Lys Leu Tyr Gly His Glu Val Ala Glu Leu Ser Gln Leu
145 150 155 160

Gly Asn Thr Val Asn Phe Met Phe Ala Glu Gly Thr Ser Cys Asn Gly
165 170 175

Lys Ser Val Leu Pro Phe Ser Ile Thr Gly Lys Lys Leu Lys Glu Phe
180 185 190

Ile Asp Pro Ser Ile Thr Thr Met Asn Pro Ala Met Ala Lys Thr Lys
195 200 205

Lys Phe Glu Leu Gln Thr Ile Gln Ile Lys Thr Asn Lys Thr Ala Ile
210 215 220

Thr Thr Leu Pro Ile Ser Asn Met Glu Tyr Leu Ser Arg Phe Leu Asn
225 230 235 240

Lys Gly Ile Asn Val Lys Cys Lys Ile Asn Glu Pro Gln Val Leu Ser
245 250 255

Asp Asn Leu Glu Glu Leu Arg Val Ala Leu Asn Gly Gly Asp Lys Tyr
260 265 270

Lys Leu Val Ser Arg Lys Leu Asp Val Glu Ser Lys Arg Asn Phe Val
275 280 285

Lys Glu Tyr Ile Ser Asp Gln Arg Lys Lys Arg Lys
290 295 300

<210> 221
<211> 759
<212> PRT

<213> Saccharomyces sp.
<400> 221 

Met Pro Ala Pro Lys Leu Thr Glu Lys Phe Ala Ser Ser Lys Ser Thr
1 5 10 15

Gln Lys Thr Thr Asn Tyr Ser Ser Ile Glu Ala Lys Ser Val Lys Thr
20 25 30

Ser Ala Asp Gln Ala Tyr Ile Tyr Gln Glu Pro Ser Ala Thr Lys Lys
35 40 45

Ile Leu Tyr Ser Ile Ala Thr Trp Leu Leu Tyr Asn Ile Phe His Cys
50 55 60

Phe Phe Arg Glu Ile Arg Gly Arg Gly Ser Phe Lys Val Pro Gln Gln
65 70 75 80

Gly Pro Val Ile Phe Val Ala Ala Pro His Ala Asn Gln Phe Val Asp
85 90 95

Pro Val Ile Leu Met Gly Glu Val Lys Lys Ser Val Asn Arg Arg Val
100 105 110

Ser Phe Leu Ile Ala Glu Ser Ser Leu Lys Gln Pro Pro Ile Gly Phe
115 120 125

Leu Ala Ser Phe Phe Met Ala Ile Gly Val Val Arg Pro Gln Asp Asn
130 135 140

Leu Lys Pro Ala Glu Gly Thr Ile Arg Val Asp Pro Thr Asp Tyr Lys
145 150 155 160

Arg Val Ile Gly His Asp Thr His Phe Leu Thr Asp Cys Met Pro Lys
165 170 175

Gly Leu Ile Gly Leu Pro Lys Ser Met Gly Phe Gly Glu Ile Gln Ser
180 185 190

Ile Glu Ser Asp Thr Ser Leu Thr Leu Arg Lys Glu Phe Lys Met Ala
195 200 205

Lys Pro Glu Ile Lys Thr Ala Leu Leu Thr Gly Thr Thr Tyr Lys Tyr
210 215 220

Ala Ala Lys Val Asp Gln Ser Cys Val Tyr His Arg Val Phe Glu His
225 230 235 240

Leu Ala His Asn Asn Cys Ile Gly Ile Phe Pro Glu Gly Gly Ser His
245 250 255

Asp Arg Thr Asn Leu Leu Pro Leu Lys Ala Gly Val Ala Ile Met Ala
260 265 270

Leu Gly Cys Met Asp Lys His Pro Asp Val Asn Val Lys Ile Val Pro
275 280 285

Cys Gly Met Asn Tyr Phe His Pro His Lys Phe Arg Ser Arg Ala Val
290 295 300

Val Glu Phe Gly Asp Pro Ile Glu Ile Pro Lys Glu Leu Val Ala Lys
305 310 315 320

Tyr His Asn Pro Glu Thr Asn Arg Asp Ala Val Lys Glu Leu Leu Asp
325 330 335

Thr Ile Ser Lys Gly Leu Gln Ser Val Thr Val Thr Cys Ser Asp Tyr
340 345 350

Glu Thr Leu Met Val Val Gln Thr Ile Arg Arg Leu Tyr Met Thr Gln
355 360 365

Phe Ser Thr Lys Leu Pro Leu Pro Leu Ile Val Glu Met Asn Arg Arg
370 375 380

Met Val Lys Gly Tyr Glu Phe Tyr Arg Asn Asp Pro Lys Ile Ala Asp
385 390 395 400

Leu Thr Lys Asp Ile Met Ala Tyr Asn Ala Ala Leu Arg His Tyr Asn
405 410 415

Leu Pro Asp His Leu Val Glu Glu Ala Lys Val Asn Phe Ala Lys Asn
420 425 430

Leu Gly Leu Val Phe Phe Arg Ser Ile Gly Leu Cys Ile Leu Phe Ser
435 440 445

Leu Ala Met Pro Gly Ile Ile Met Phe Ser Pro Val Phe Ile Leu Ala
450 455 460

Lys Arg Ile Ser Gln Glu Lys Ala Arg Thr Ala Leu Ser Lys Ser Thr
465 470 475 480

Val Lys Ile Lys Ala Asn Asp Val Ile Ala Thr Trp Lys Ile Leu Ile
485 490 495

Gly Met Gly Phe Ala Pro Leu Leu Tyr Ile Phe Trp Ser Val Leu Ile
500 505 510

Thr Tyr Tyr Leu Arg His Lys Pro Trp Asn Lys Ile Tyr Val Phe Ser
515 520 525

Gly Ser Tyr Ile Ser Cys Val Ile Val Thr Tyr Ser Ala Leu Ile Val
530 535 540

Gly Asp Ile Gly Met Asp Gly Phe Lys Ser Leu Arg Pro Leu Val Leu
545 550 555 560

Ser Leu Thr Ser Pro Lys Gly Leu Gln Lys Leu Gln Lys Asp Arg Arg
565 570 575

Asn Leu Ala Glu Arg Ile Ile Glu Val Val Asn Asn Phe Gly Ser Glu
580 585 590

Leu Phe Pro Asp Phe Asp Ser Ala Ala Leu Arg Glu Glu Phe Asp Val
595 600 605

Ile Asp Glu Glu Glu Glu Asp Arg Lys Thr Ser Glu Leu Asn Arg Arg
610 615 620

Lys Met Leu Arg Lys Gln Lys Ile Lys Arg Gln Glu Lys Asp Ser Ser
625 630 635 640

Ser Pro Ile Ile Ser Gln Arg Asp Asn His Asp Ala Tyr Glu His His
645 650 655

Asn Gln Asp Ser Asp Gly Val Ser Leu Val Asn Ser Asp Asn Ser Leu
660 665 670

Ser Asn Ile Pro Leu Phe Ser Ser Thr Phe His Arg Lys Ser Glu Ser
675 680 685

Ser Leu Ala Ser Thr Ser Val Ala Pro Ser Ser Ser Ser Glu Phe Glu
690 695 700

Val Glu Asn Glu Ile Leu Glu Glu Lys Asn Gly Leu Ala Ser Lys Ile
705 710 715 720

Ala Gln Ala Val Leu Asn Lys Arg Ile Gly Glu Asn Thr Ala Arg Glu
725 730 735

Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu
740 745 750

Glu Gly Lys Glu Gly Asp Ala
755

<210> 222
<211> 743
<212> PRT

<213> Saccharomyces sp.

<400> 222

Met Ser Ala Pro Ala Ala Asp His Asn Ala Ala Lys Pro Ile Pro His
1 5 10 15

Val Pro Gln Ala Ser Arg Arg Tyr Lys Asn Ser Tyr Asn Gly Phe Val
20 25 30

Tyr Asn Ile His Thr Trp Leu Tyr Asp Val Ser Val Phe Leu Phe Asn
35 40 45

Ile Leu Phe Thr Ile Phe Phe Arg Glu Ile Lys Val Arg Gly Ala Tyr
50 55 60

Asn Val Pro Glu Val Gly Val Pro Thr Ile Leu Val Cys Ala Pro His
65 70 75 80

Ala Asn Gln Phe Ile Asp Pro Ala Leu Val Met Ser Gln Thr Arg Leu
85 90 95

Leu Lys Thr Ser Ala Gly Lys Ser Arg Ser Arg Met Pro Cys Phe Val
100 105 110

Thr Ala Glu Ser Ser Phe Lys Lys Arg Phe Ile Ser Phe Phe Gly His
115 120 125

Ala Met Gly Gly Ile Pro Val Pro Arg Ile Gln Asp Asn Leu Lys Pro
130 135 140

Val Asp Glu Asn Leu Glu Ile Tyr Ala Pro Asp Leu Lys Asn His Pro
145 150 155 160

Glu Ile Ile Lys Gly Arg Ser Lys Asn Pro Gln Thr Thr Pro Val Asn
165 170 175

Phe Thr Lys Arg Phe Ser Ala Lys Ser Leu Leu Gly Leu Pro Asp Tyr
180 185 190

Leu Ser Asn Ala Gln Ile Lys Glu Ile Pro Asp Asp Glu Thr Ile Ile
195 200 205

Leu Ser Ser Pro Phe Arg Thr Ser Lys Ser Lys Val Val Glu Leu Leu
210 215 220

Thr Asn Gly Thr Asn Phe Lys Tyr Ala Glu Lys Ile Asp Asn Thr Glu
225 230 235 240

Thr Phe Gln Ser Val Phe Asp His Leu His Thr Lys Gly Cys Val Gly
245 250 255

Ile Phe Pro Glu Gly Gly Ser His Asp Arg Pro Ser Leu Leu Pro Ile
260 265 270

Lys Ala Gly Val Ala Ile Met Ala Leu Gly Ala Val Ala Ala Asp Pro
275 280 285

Thr Met Lys Val Ala Val Val Pro Cys Gly Leu His Tyr Phe His Arg
290 295 300

Asn Lys Phe Arg Ser Arg Ala Val Leu Glu Tyr Gly Glu Pro Ile Val
305 310 315 320

Val Asp Gly Lys Tyr Gly Glu Met Tyr Lys Asp Ser Pro Arg Glu Thr
325 330 335

Val Ser Lys Leu Leu Lys Lys Ile Thr Asn Ser Leu Phe Ser Val Thr
340 345 350

Glu Asn Ala Pro Asp Tyr Asp Thr Leu Met Val Ile Gln Ala Ala Arg
355 360 365

Arg Leu Tyr Gln Pro Val Lys Val Arg Leu Pro Leu Pro Ala Ile Val
370 375 380

Glu Ile Asn Arg Arg Leu Leu Phe Gly Tyr Ser Lys Phe Lys Asp Asp
385 390 395 400

Pro Arg Ile Ile His Leu Lys Lys Leu Val Tyr Asp Tyr Asn Arg Lys
405 410 415

Leu Asp Ser Val Gly Leu Lys Asp His Gln Val Met Gln Leu Lys Thr
420 425 430

Thr Lys Leu Glu Ala Leu Arg Cys Phe Val Thr Leu Ile Val Arg Leu
435 440 445

Ile Lys Phe Ser Val Phe Ala Ile Leu Ser Leu Pro Gly Ser Ile Leu
450 455 460

Phe Thr Pro Ile Phe Ile Ile Cys Arg Val Tyr Ser Glu Lys Lys Ala
465 470 475 480

Lys Glu Gly Leu Lys Lys Ser Leu Val Lys Ile Lys Gly Thr Asp Leu
485 490 495

Leu Ala Thr Trp Lys Leu Ile Val Ala Leu Ile Leu Ala Pro Ile Leu
500 505 510

Tyr Val Thr Tyr Ser Ile Leu Leu Ile Ile Leu Ala Arg Lys Gln His
515 520 525

Tyr Cys Arg Ile Trp Val Pro Ser Asn Asn Ala Phe Ile Gln Phe Val
530 535 540

Tyr Phe Tyr Ala Leu Leu Val Phe Thr Thr Tyr Ser Ser Leu Lys Thr
545 550 555 560

Gly Glu Ile Gly Val Asp Leu Phe Lys Ser Leu Arg Pro Leu Phe Val
565 570 575

Ser Ile Val Tyr Pro Gly Lys Lys Ile Glu Glu Ile Gln Thr Thr Arg
580 585 590

Lys Asn Leu Ser Leu Glu Leu Thr Ala Val Cys Asn Asp Leu Gly Pro
595 600 605

Leu Val Phe Pro Asp Tyr Asp Lys Leu Ala Thr Glu Ile Phe Ser Lys
610 615 620

Arg Asp Gly Tyr Asp Val Ser Ser Asp Ala Glu Ser Ser Ile Ser Arg
625 630 635 640

Met Ser Val Gln Ser Arg Ser Arg Ser Ser Ser Ile His Ser Ile Gly
645 650 655

Ser Leu Ala Ser Asn Ala Leu Ser Arg Val Asn Ser Arg Gly Ser Leu
660 665 670

Thr Asp Ile Pro Ile Phe Ser Asp Ala Lys Gln Gly Gln Trp Lys Ser
675 680 685

Glu Gly Glu Thr Ser Glu Asp Glu Asp Glu Phe Asp Glu Lys Asn Pro
690 695 700

Ala Ile Val Gln Thr Ala Arg Ser Ser Asp Leu Asn Lys Glu Asn Ser
705 710 715 720

Arg Asn Thr Asn Ile Ser Ser Lys Ile Ala Ser Leu Val Arg Gln Lys
725 730 735

Arg Glu His Glu Lys Lys Glu
740

<210> 223
<211> 397 
<212> PRT 

<213> Saccharomyces sp. 
<400> 223 

Met Leu His Gln Lys Ile Ala His Lys Val Arg Lys Val Val Val Pro
1 5 10 15

Gly Ile Ser Leu Leu Ile Phe Phe Gln Gly Cys Leu Ile Leu Leu Phe
20 25 30

Leu Gln Leu Thr Tyr Lys Thr Leu Tyr Cys Arg Asn Asp Ile Arg Lys
35 40 45

Gln Ile Gly Leu Asn Lys Thr Lys Arg Leu Phe Ile Val Leu Val Ser
50 55 60

Ser Ile Leu His Val Val Ala Pro Ser Ala Val Arg Ile Thr Thr Glu
65 70 75 80

Asn Ser Ser Val Pro Lys Gly Thr Phe Phe Leu Asp Leu Lys Lys Lys
85 90 95

Arg Ile Leu Ser His Leu Lys Ser Asn Ser Val Ala Ile Cys Asn His
100 105 110

Gln Ile Tyr Thr Asp Trp Ile Phe Leu Trp Trp Leu Ala Tyr Thr Ser
115 120 125

Asn Leu Gly Ala Asn Val Phe Ile Ile Leu Lys Lys Ser Leu Ala Ser
130 135 140

Ile Pro Ile Leu Gly Phe Gly Met Arg Asn Tyr Asn Phe Ile Phe Met
145 150 155 160

Ser Arg Lys Trp Ala Gln Asp Lys Ile Thr Leu Ser Asn Ser Leu Ala
165 170 175

Gly Leu Asp Ser Asn Ala Arg Gly Ala Gly Ser Leu Ala Gly Lys Ser
180 185 190

Pro Glu Arg Ile Thr Glu Glu Gly Glu Ser Ile Trp Asn Pro Glu Val
195 200 205

Ile Asp Pro Lys Gln Ile His Trp Pro Tyr Asn Leu Ile Leu Phe Pro
210 215 220

Glu Gly Thr Asn Leu Ser Ala Asp Thr Arg Gln Lys Ser Ala Lys Tyr
225 230 235 240

Ala Ala Lys Ile Gly Lys Lys Pro Phe Lys Asn Val Leu Leu Pro His
245 250 255

Ser Thr Gly Leu Arg Tyr Ser Leu Gln Lys Leu Lys Pro Ser Ile Glu
260 265 270

Ser Leu Tyr Asp Ile Thr Ile Gly Tyr Ser Gly Val Lys Gln Glu Glu
275 280 285

Tyr Gly Glu Leu Ile Tyr Gly Leu Lys Ser Ile Phe Leu Glu Gly Lys
290 295 300

Tyr Pro Lys Leu Val Asp Ile His Ile Arg Ala Phe Asp Val Lys Asp
305 310 315 320

Ile Pro Leu Glu Asp Glu Asn Glu Phe Ser Glu Trp Leu Tyr Lys Ile
325 330 335

Trp Ser Glu Lys Asp Ala Leu Met Glu Arg Tyr Tyr Ser Thr Gly Ser
340 345 350

Phe Val Ser Asp Pro Glu Thr Asn His Ser Val Thr Asp Ser Phe Lys
355 360 365

Ile Asn Arg Ile Glu Leu Thr Glu Val Leu Ile Leu Pro Thr Leu Thr
370 375 380

Ile Ile Trp Leu Val Tyr Lys Leu Tyr Cys Phe Ile Phe
385 390 395

<210> 224 
<211> 303 
<212> PRT 

<213> Saccharomyces sp. 
<400> 224 

Met Ser Val Ile Gly Arg Phe Leu Tyr Tyr Leu Arg Ser Val Leu Val
1 5 10 15

Val Leu Ala Leu Ala Gly Cys Gly Phe Tyr Gly Val Ile Ala Ser Ile
20 25 30

Leu Cys Thr Leu Ile Gly Lys Gln His Leu Ala Gln Trp Ile Thr Ala
35 40 45

Arg Cys Phe Tyr His Val Met Lys Leu Met Leu Gly Leu Asp Val Lys
50 55 60

Val Val Gly Glu Glu Asn Leu Ala Lys Lys Pro Tyr Ile Met Ile Ala
65 70 75 80

Asn His Gln Ser Thr Leu Asp Ile Phe Met Leu Gly Arg Ile Phe Pro
85 90 95

Pro Gly Cys Thr Val Thr Ala Lys Lys Ser Leu Lys Tyr Val Pro Phe
100 105 110

Leu Gly Trp Phe Met Ala Leu Ser Gly Thr Tyr Phe Leu Asp Arg Ser
115 120 125

Lys Arg Gln Glu Ala Ile Asp Thr Leu Asn Lys Gly Leu Glu Asn Val
130 135 140

Lys Lys Asn Lys Arg Ala Leu Trp Val Phe Pro Glu Gly Thr Arg Ser
145 150 155 160

Tyr Thr Ser Glu Leu Thr Met Leu Pro Phe Lys Lys Gly Ala Phe His
165 170 175

Leu Ala Gln Gln Gly Lys Ile Pro Ile Val Pro Val Val Val Ser Asn
180 185 190

Thr Ser Thr Leu Val Ser Pro Lys Tyr Gly Val Phe Asn Arg Gly Cys
195 200 205

Met Ile Val Arg Ile Leu Lys Pro Ile Ser Thr Glu Asn Leu Thr Lys
210 215 220

Asp Lys Ile Gly Glu Phe Ala Glu Lys Val Arg Asp Gln Met Val Asp
225 230 235 240

Thr Leu Lys Glu Ile Gly Tyr Ser Pro Ala Ile Asn Asp Thr Thr Leu
245 250 255

Pro Pro Gln Ala Ile Glu Tyr Ala Ala Leu Gln His Asp Lys Lys Val
260 265 270

Asn Lys Lys Ile Lys Asn Glu Pro Val Pro Ser Val Ser Ile Ser Asn
275 280 285

Asp Val Asn Thr His Asn Glu Gly Ser Ser Val Lys Lys Met His
290 295 300



<210> 225 
<211> 1146 
<212> DNA 

<213> Saccharomyces sp. 



<400> 225 

atgtctttta gggatgtcct agaaagagga gatgaatttt tagaagccta tcccagaaga 60
agcccccttt ggagatttct ttcatacagt acatcattac tgaccttcgg tgtatcaaaa 120
ctgcttcttt tcacatgcta taatgtcaaa ttgaatggtt ttgaaaaatt agaaactgcc 180
ttggaacgtt ccaaaaggga aaatagaggc cttatgacgg tcatgaacca tatgagtatg 240
gtcgatgatc cgttagtttg ggcaacacta ccatataagt tatttacgtc tttggacaac 300
ataagatggt ctttgggtgc acataatatt tgctttcaaa ataaatttct ggccaacttt 360
ttctcacttg gccaagtcct ttcaacagaa agatttgggg tgggcccatt tcaaggttct 420
atagatgctt caataagatt gttaagccct gacgacactt tagacttgga atggacccct 480
cactctgagg tctcttcttc gctaaaaaaa gcctactccc cgcccataat aaggtcgaag 540
ccatcttggg tccatgttta tccagaagga tttgtactac aattatatcc gccttttgaa 600
aattcgatga ggtattttaa atggggtatt accagaatga tcctagaagc aacaaagccg 660
cccattgtag taccaatatt tgctacaggg tttgaaaaaa tagcatccga agcagtcaca 720
gattcaatgt ttagacaaat tctaccaaga aactttggct ctgaaataaa tgttaccata 780
ggggatcctt taaatgatga tttaatcgac aggtatagaa aagaatggac acatttggtt 840
gaaaaatact atgatcccaa aaatcctaac gacctctctg acgaattgaa atatggtaaa 900
gaggcgcaag atttaagaag cagattagcc gctgaactga gagcccatgt tgctgaaatt 960
agaaatgaag ttcgcaaatt accacgcgaa gaccctaggt tcaaatcccc ctcatggtgg 1020
aagcggttca acaccacgga aggtaaatcg gacccagatg ttaaagtcat tggcgaaaat 1080
tgggcaataa ggaggatgca aaagtttctg cctccagagg gtaaaccaaa gggtaaggat 1140
gattga 1146



<210> 226 
<211> 1191 
<212> DNA 

<213> Saccharomyces sp.



<400> 226 

atgaagcatt cccaaaaata ccgtaggtat ggaatttatg aaaagactgg taatcccttt 60
ataaaagggt tgcaaaggct gcttatcgct tgcttgttca tttcaggctc gctgagtatt 120
gtcgtttttc agatctgtct acaggtgctt ctcccttgga gcaagattag atttcaaaat 180
ggtataaatc aaagtaagaa ggcttttatc gttttattat gcatgatctt gaacatggtg 240
gctccctctt ctttgaatgt cacttttgaa acatcgcggc cattgaagaa ctcttctaac 300
gccaagccat gctttagatt taaagacagg gctataataa ttgcaaatca tcaaatgtat 360
gcagactgga tttatctctg gtggctttcc tttgtttcaa atttgggtgg taacgtttat 420
atcatcctga agaaagctct gcagtacata ccattactgg gatttggcat gcgaaatttt 480
aagtttatat ttttaagtag gaactggcaa aaggatgaga aagctttaac aaatagtttg 540
gtttctatgg acttaaacgc gaggtgcaag gggcccctta caaattataa gagttgttat 600
tccaagacaa atgaatccat tgccgcttat aatttaatca tgttccctga gggtacaaat 660
ctaagcctca agacaagaga aaaaagcgag gcattctgtc aaagagcaca tttggaccat 720
gtccaattaa gacatttgtt attaccgcac tctaaaggct tgaagtttgc agtagaaaaa 780
ctagctccta gtttagatgc tatctacgat gtcactattg gatattctcc cgccttgaga 840
acggaatacg tcggcaccaa attcaccttg aagaaaatat tcttaatggg tgtctatccg 900
gagaaagtag atttttatat tagggaattt agagttaatg agatcccttt gcaagatgac 960
gaagtttttt tcaattggtt actgggcgtg tggaaagaaa aagatcaact gctagaagac 1020
tactacaaca caggccaatt taaaagtaat gctaaaaatg acaaccaatc catcgttgtt 1080
acgacacaaa cgactggatt tcagcacgaa acattgacac cccgtatcct ttcatattac 1140
gggttcttcg cttttcttat tcttgtattt gtgatgaaaa aaaatcattg a 1191

<210> 227
<211> 1440
<212> DNA

<213> Saccharomyces sp.



<400> 227 

atgggttttg ttgatttctt cgaaacatat atggtcggtt ctagggtcca gttcaaacag 60
ttagatattt ctgattggtt gagtctgacc ccaaggttgc ttattctttt tggctatttt 120
taccttcatt ctttttttac tgcaatcaat caattcctac agttcattaa cacgaattcc 180
ttctgtctta gactgcattt actatatgac agattttggt cgcatgtgcc cataataggt 240
gagtacaaaa ttcggctgct ctcgagggca ctgacatata gtaaactgaa aataatacca 300
actttagaca aggtgctgga ggcgattgaa atttggtttc agctacattt agttgaaatg 360
accttcgaaa aaaaaaaaaa cgtccaaatt ttcataaccg agggaagtga tgacctaaac 420
ttttttaaag atagcaaatt ccaaaccaca ttaatgatat gtaatcatcg atcagtgaat 480
gactacacat tgattaatta cctttttctc aaaagttgtc ccaccaagtt ttatactaaa 540
tgggaatttc tacaaaagct gaggaagggg gaagatctag ctgaatggcc tcagttaaaa 600
tttcttggtt ggggaaaaat gtttaacttt cctcgattgg atctactaaa gaacatattc 660
ttcaaagatg aaacactcgc actctcatcg aatgagttaa gagatatttt agaaagacaa 720
aacaatcaag ctattactat ttttcccgaa gtcaatatca tgagtttgga actatcaatt 780
attcaaagaa aattacacca agattttccc tttgttataa acttctataa tttattatac 840
ccaagattta aaaactttac cactttgatg gctgcttttt catcaattaa aaacatcaaa 900
agaaagaaaa accgtaacaa tataatcaaa gaggcccgat acctgtttca cagagaactt 960
gacaaattag ttcacaagag catgaaaatg gagtcttcca aggtatccga taagacgacg 1020
ccgcccatga tcgtagataa ttcatactta cttacaaaaa aggaagaaat cagcagcggc 1080
aagcccaagg tggtacgaat caatccatac atatatgatg tcaccataat ttattaccga 1140
gtcaaatata ctgatagtgg gcatgatcat accaacggag atttgagact tcataaaggt 1200
tatcaattag agcaaatatc tccgacaatc tttgagatga ttcaaccaga aatggagtct 1260
gaaaacaaca taaaggataa ggaccccatt gttgtgatgg taaatgtaaa aaagcatcaa 1320
attcaaccat tactcgcata caatgatgag agtttagaaa agtggcttga aaataggtgg 1380
atagaaaaag atagattaat cgagtccttg caaaaaaata ttaaaattga gaccaaataa 1440

<210> 228
<211> 903
<212> DNA

<213> Saccharomyces sp.



<400> 228 

atggaaaagt acaccaattg gagagacaat ggtacgggaa tagctccatt tctaccaaac 60
acaatcagga aacctagtaa ggtgatgaca gcgtgtttgt tgggtatcct aggggtgaaa 120
accattataa tgctaccatt gattatgctg taccttctaa ctggccagaa caacttactg 180
ggtttgatat tgaagtttac attcagttgg aaagaggaaa ttaccgtgca aggaatcaag 240
aaacgtgacg taaggaaatc caagcattat ccacagaagg gcaagcttta tatttgcaat 300
tgtacctcac ctttagatgc tttttcagtg gtgttattag ctcaagggcc tgttacgttg 360
ttggtcccat ccaatgacat tgtatacaaa gtttccataa gagaattcat caacttcatc 420
ctcgccggtg ggttagatat aaaactctat ggccacgagg tagcagagct atctcaattg 480
ggcaataccg tgaattttat gtttgctgag ggtacctcat gtaatggtaa aagcgtctta 540
ccgtttagta taaccgggaa aaaacttaaa gaattcatag acccttcaat aaccacaatg 600
aaccccgcaa tggccaaaac taaaaaattt gaattgcaga ccatccaaat caaaactaat 660
aaaactgcca tcaccacatt gcccatctcc aatatggagt atttatctag atttctgaac 720
aagggcatta atgttaaatg caagatcaac gagccacaag tactctcgga taatttagag 780
gaattacgcg ttgcattaaa cggtggcgac aaatataaac tagtctcacg gaagttagat 840
gttgaatcta agaggaattt tgtgaaggaa tatatcagcg atcaacgtaa aaagaggaag 900
tag 903



<210> 229 
<211> 2280 
<212> DNA 

<213> Saccharomyces sp.
<400> 229 

atgcctgcac caaaactcac ggagaaattt gcctcttcca agagcacaca gaaaactacg 60
aattacagtt ccatcgaggc caaaagcgtc aagacgtcgg ctgatcaggc atacatctac 120
caagagccta gcgctaccaa gaagatactt tactccatcg ccacatggct gttgtacaac 180
atcttccact gcttctttag agaaatcaga ggccggggca gtttcaaggt accgcaacag 240
ggaccggtga tctttgttgc ggctccgcat gctaaccagt tcgtcgaccc tgtaatcctt 300
atgggcgagg tgaagaaatc tgtcaacaga cgtgtgtcct tcttgattgc ggagagctca 360
ttaaagcaac cccccatagg gtttttggct agtttcttca tggccatagg cgtggtaagg 420
ccgcaggata atttgaaacc ggcagaaggt actatccgcg tagatccaac agactacaag 480
agagttatcg gccacgacac gcatttcttg actgattgta tgccaaaggg tctcatcggg 540
ttacccaaat caatgggatt tggagaaatc cagtccatag aaagtgacac gagtttgacc 600
ctaagaaaag agttcaaaat ggccaaacca gagattaaaa ctgctttact caccggcact 660
acttataaat atgccgctaa agtcgaccaa tcttgcgttt accatagagt ttttgagcat 720
ttggcccata acaactgcat tgggatcttt cctgaaggtg ggtcccacga cagaacaaac 780
ttgttgcccc tgaaagcagg tgtggcgatt atggctcttg gttgcatgga taagcatcct 840
gacgtcaatg ttaagattgt tccctgcggt atgaattatt tccatccaca taagttcagg 900
tcgagagcgg ttgttgaatt cggtgacccc attgaaatac cgaaggaact agtcgccaag 960
taccacaacc cggaaacgaa cagagatgca gtgaaagaat tattagatac ca^atcgaag 1020
ggtttacaat ccgttaccgt tacatgttct gattatgaaa ctttgatggt ggttcaaacg 1080
ataagaagac tatatatgac acaatttagc accaagttac cgttgccctt gattgtggaa 1140
atgaacagaa gaatggtcaa aggttacgaa ttctatagaa acgatcctaa aatagcggac 1200
ttgaccaaag atataatggc atataatgcc gccttgagac actataatct tcctgatcac 1260
cttgtggagg aggcaaaggt aaatttcgca aaaaacctcg gacttgtttt ttttagatcc 1320
atcgggctct gcatcctctt ttcgttagcc atgccaggta tcattatgtt ctcacctgtc 1380
ttcatattag ccaagagaat ttctcaagaa aaggcccgta ccgctttgtc caagtctaca 1440
gttaaaataa aggctaacga tgtcattgcc acgtggaaaa tcttgattgg gatgggattt 1500
gcgcccttgc tttacatctt ttggtccgtt ttaatcactt attacctcag acataaacca 1560
tggaataaaa tatatgtttt ttccgggtct tacatctcgt gtgttatagt cacgtattcc 1620
gccttaatcg tgggtgatat tggtatggat ggtttcaaat ctttgagacc actggtttta 1680
tctcttacat ctccaaaggg cttgcaaaag ctacaaaagg atcgtagaaa tctggcagaa 1740
agaataatcg aagttgtaaa taactttgga agcgaattat tccccgattt cgatagtgcc 1800
gccctacgtg aagaattcga cgtcatcgat gaagaggaag aagatcgaaa aacctcagaa 1860
ttgaatcgca ggaaaatgct aagaaaacag aaaataaaaa gacaagaaaa agattcgtca 1920
tcacctatca tcagccaacg tgacaaccac gatgcctatg aacaccataa ccaagattcc 1980
gatggcgtct cattggtcaa tagtgacaat tccctctcta acattccatt attctcttct 2040
acttttcatc gtaagtcaga gtcttcctta gcttcgacat ccgttgcacc ttcttcttcc 2100
tccgaatttg aggtagaaaa cgaaatcttg gaggaaaaaa atggattagc aagtaaaatc 2160
gcacaggccg tcttaaacaa gagaattggt gaaaatactg ccagggaaga ggaagaggaa 2220
gaagaagagg aagaagaaga agaggaagaa gaagaagaag ggaaagaagg agatgcgtag 2280

<210> 230
<211> 2232
<212> DNA

<213> Saccharomyces sp.
<400> 230

atgtctgctc ccgctgccga tcataacgct gccaaaccta ttcctcatgt acctcaagcg 60
tcccgacggt acaaaaattc atacaatgga ttcgtataca atatacatac atggctgtat 120
gatgtgtctg tatttctgtt taatattttg ttcactattt tcttcagaga aattaaggta 180
cgtggtgcat ataacgttcc cgaagttggg gtgccaacca tccttgtgtg tgcccctcat 240
gcaaatcagt tcatcgaccc ggctttggta atgtcgcaaa cccgtttgct gaagacatca 300
gcgggaaagt cccgatccag aatgccttgt tttgttactg ctgagtcgag ttttaagaaa 360
agatttatct ctttctttgg tcacgcaatg ggcggtattc ccgtgcctag aattcaggac 420
aacttgaagc cagtggatga gaatcttgag atttacgctc cggacttgaa gaaccacccg 480
gaaatcatca agggccgctc caagaaccca cagactacac cagtgaactt tacgaaaagg 540
ttttctgcca agtccttgct tggattgccc gactacttaa gtaatgctca aatcaaggaa 600
atcccggatg atgaaacgat aatcttgtcc tctccattca gaacatcgaa atcaaaagtg 660
gtggagctct tgactaatgg tactaatttt aaatatgcag agaaaatcga caatacggaa 720
actttccaga gtgtttttga tcacttgcat acgaagggct gtgtaggtat tttccccgag 780
ggtggttctc atgaccgtcc ttcgttacta cccatcaagg caggtgttgc cattatggct 840
ctgggcgcag tagccgctga tcctaccatg aaagttgctg ttgtaccctg tggtttgcat 900
tatttccaca gaaataaatt cagatctaga gctgttttag aatacggcga acctatagtg 960
gtggatggga aatatggcga aatgtataag gactccccac gtgagaccgt ttccaaacta 1020
ctaaaaaaga tcaccaattc tttgttttct gttaccgaaa atgctccaga ttacgatact 1080
ttgatggtca ttcaggctgc cagaagacta tatcaaccgg taaaagtcag gctacctttg 1140
cctgccattg tagaaatcaa cagaaggtta cttttcggtt attccaagtt taaagatgat 1200
ccaagaatta ttcacttaaa aaaactggta tatgactaca acaggaaatt agattcagtg 1260
ggtttaaaag accatcaggt gatgcaatta aaaactacca aattagaagc attgaggtgc 1320
tttgtaactt tgatcgttcg attgattaaa ttttctgtct ttgctatact atcgttaccg 1380
ggttctattc tcttcactcc aattttcatt atttgtcgcg tatactcaga aaagaaggcc 1440
aaagagggtt taaagaaatc attggttaaa attaagggta ccgatttgtt ggccacatgg 1500
aaacttatcg tggcgttaat attggcacca attttatacg ttacttactc gatcttgttg 1560
attattttgg caagaaaaca acactattgt cgcatctggg ttccttccaa taacgcattc 1620
atacaatttg tctattttta tgcgttattg gttttcacca cgtattcctc tttaaagacc 1680
ggtgaaatcg gtgttgacct tttcaaatct ttaagaccac tttttgtttc tattgtttac 1740
cccggtaaga agatcgaaga aatccaaaca acaagaaaga atttaagtct agagttgact 1800
gctgtttgta acgatttagg acctttggtt ttccctgatt acgataaatt agcgactgag 1860
atattctcta agagagacgg ttatgatgtc tcttctgatg cagagtcttc tataagtcgt 1920
atgagtgtac aatctagaag ccgctcttct tctatacatt ctattggctc gctagcttct 1980
aacgccctat caagagtgaa ttcaagaggc tcgttgaccg atattccaat tttttctgat 2040
gcaaagcaag gtcaatggaa aagtgaaggt gaaactagtg aggatgagga tgaatttgat 2100
gagaaaaatc ctgccatagt acaaaccgca cgaagttctg atctaaataa ggaaaacagt 2160
cgcaacacaa atatatcttc gaagattgct tcgctggtaa gacagaaaag agaacacgaa 2220
aagaaagaat ga 2232

<210> 231
<211> 1194 
<212> DNA 

<213> Saccharomyces sp. 



<400> 231 

atgctgcatc aaaaaatagc tcataaagtt cgaaaagtcg tcgtcccagg tatttcctta 60
ttgattttct tccagggatg ccttattctt ttgtttctcc aactcaccta taagactctt 120
tactgtagaa atgatataag gaaacaaatt ggtctcaata aaaccaaaag attatttatt 180
gtcttggtat catccatttt gcatgttgtc gcaccatctg cagtgagaat taccactgaa 240
aattccagtg ttcctaaagg tacttttttt ttagacttga agaagaaaag gattctttct 300
catctaaagt ccaattcggt ggccatttgc aatcaccaaa tatacacgga ttggatattt 360
ttatggtggt tggcttacac atcgaactta ggggctaatg tcttcattat tttaaaaaaa 420
tcgttggctt ccattcctat cctcggtttc ggtatgagaa actataattt catttttatg 480
agtagaaagt gggcacaaga caaaataacc ctaagcaaca gccttgctgg ccttgattcg 540
aatgcaaggg gcgccggctc acttgctgga aagtcacctg agcgcataac tgaggaagga 600
gagagcatat ggaatccgga ggttattgat ccaaaacaaa tccattggcc atacaatctt 660
atcctattcc ctgaaggtac aaatctcagt gctgatacta ggcaaaaaag tgctaaatat 720
gctgccaaaa taggcaaaaa gccattcaag aatgtgctac tgcctcattc tacaggccta 780
agatactcgt tacaaaagtt gaagccaagt attgaaagtc tttatgatat tacgatcggc 840
tactccggtg taaaacagga ggaatatggt gagcttatat atgggctgaa gagcatattt 900
ttagaaggaa aatacccgaa gttagtcgat attcacatca gagcatttga tgttaaagat 960
attccattag aggacgagaa tgaattttca gaatggctgt ataaaatttg gagtgagaag 1020
gatgctctaa tggaaaggta ctattccact ggatcattcg taagtgatcc tgaaacaaac 1080
cattcagtta ccgatagttt caagatcaat cgtattgagt taactgaagt gctaatatta 1140
ccaactctaa caataatttg gttagtttat aaactttatt gttttatttt ttga 1194



<210> 232 
<211> 912 
<212> DNA 

<213> Saccharomyces sp.



<400> 232 

atgagtgtga taggtaggtt cttgtattac ttgaggtccg tgttggtcgt actggcgctt 60
gcaggctgtg gcttttacgg tgtaatcgcc tctatccttt gcacgttaat cggtaagcaa 120
catttggctc agtggattac tgcgcgttgt ttttaccatg tcatgaaatt gatgcttggc 180
cttgacgtca aggtcgttgg cgaggagaat ttggccaaga agccatatat tatgattgcc 240
aatcaccaat ccaccttgga tatcttcatg ttaggtagga ttttcccccc tggttgcaca 300
gttactgcca agaagtcttt gaaatacgtc ccctttctgg gttggttcat ggctttgagt 360
ggtacatatt tcttagacag atctaaaagg caagaagcca ttgacacctt gaataaaggt 420
ttagaaaatg ttaagaaaaa caagcgtgct ctatgggttt ttcctgaggg taccaggtct 480
tacacgagtg agctgacaat gttgcctttc aagaagggtg ctttccattt ggcacaacag 540
ggtaagatcc ccattgttcc agtggttgtt tccaatacca gtactttagt aagtcctaaa 600
tatggggtct tcaacagagg ctgtatgatt gttagaattt taaaacctat ttcaaccgag 660
aacttaacaa aggacaaaat tggtgaattt gctgaaaaag ttagagatca aatggttgac 720
actttgaagg agattggcta ctctcccgcc atcaacgata caaccctccc accacaagct 780
attgagtatg ccgctcttca acatgacaag aaagtgaaca agaaaatcaa gaatgagcct 840
gtgccttctg tcagcattag caacgatgtc aatacccata acgaaggttc atctgtaaaa 900
aagatgcatt aa 912

<210> 233
<211> 54 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 



<400> 233 

cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat ttaa 

<210> 234 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 234 

tcgaggatcc gcggccgcaa gcttcctgca gg 



<210> 235 
<211> 32 
<212> DNA 
<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 235 

tcgacctgca ggaagcttgc ggccgcggat cc 

<210> 236 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide 

<400> 236 

tcgacctgca ggaagcttgc ggccgcggat cc 



<210> 237 
<211> 32 
<212> DNA 
<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 237 

tcgaggatcc gcggccgcaa gcttcctgca gg 



<210> 238 
<211> 36 
<212> DNA 
<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 238 

tcgaggatcc gcggccgcaa gcttcctgca ggagct 



<210> 239 
<211> 28 
<212> DNA 
<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 239 

cctgcaggaa gcttgcggcc gcggatcc 



<210> 240 
<211> 36 
<212> DNA 
<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 240 

tcgacctgca ggaagcttgc ggccgcggat ccagct 



<210> 241 
<211> 28 
<212> DNA 
<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
Oligonucleotide

<400> 241 

ggatccgcgg ccgcaagctt cctgcagg 